Some of the goals of the project are:
- providing components that can aggregate, on the fly, storage and metadata farms that expose standard protocols supporting redirection and WAN data access
- making them behave transparently as a single system, similar to a WAN-wide cluster
- building on the fly the illusion of a unique namespace from a set of distinct storage endpoints or metadata endpoints such as databases
- read and write support
- full S3 support, including the hierarchical organization of the files in buckets
- acting as a transparent federating entry point for multiple S3/Azure storage buckets
- simple and effective VOMS-friendly authorization scheme, compatible with 'normal' Apache authentication plugins (e.g. username/password)
- authorization rules are expandable through customizable Python functions
- the core of the system can also accommodate explicit, catalogue-based indexing and name translation
- very high compatibility with external data management systems
- very high flexibility on naming conventions, with algorithmic and non-algorithmic translations
- Very high performance and scalability, even with slow/distant endpoints or external databases.
The most notable examples of suitable protocols are HTTP/WebDAV, S3 and Azure.
The system is generic enough to support others, if a suitable frontend exists.
Documentation
The Dynafed documentation is available as a white paper:
https://gitlab.cern.ch/lcgdm/dynafed/raw/develop/doc/whitepaper/Doc_DynaFeds.pdf?inline=false
Important: System tuning
A Dynafed instance that is intended for production use *must* be run on a properly configured machine. Tuning hints are available at:
https://twiki.cern.ch/twiki/bin/view/DPM/DpmSetupTuningHints
Installation
Dynafed is available through the EPEL repository.
yum install dynafed dynafed-http-plugin dynafed-dmlite-frontend memcached
Ensure the services come back after a reboot:
/sbin/chkconfig httpd on
/sbin/chkconfig memcached on
/sbin/chkconfig php-fpm on
SELinux may prevent httpd from accessing the necessary files. You can either adjust the SELinux configuration appropriately, or turn it off by editing /etc/sysconfig/selinux and rebooting (or by running echo 0 >/selinux/enforce).
Similarly, you may have to adjust the local firewall (typically iptables) to allow traffic in (clients) and out (fed requests).
Give decent parameters to memcached, in /etc/sysconfig/memcached
PORT="11211"
USER="memcached"
MAXCONN="16384"
CACHESIZE="1024"
OPTIONS=""
The principal configuration files are
/etc/ugr/ugr.conf
/etc/ugr/conf.d/endpoints.conf
/etc/httpd/conf.d/zlcgdm-ugr-dav.conf
Before starting, completely wipe the content of the file /etc/httpd/conf.d/zlcgdm-dav.conf if it's present in the directory.
cat /dev/null > /etc/httpd/conf.d/zlcgdm-dav.conf
NB - the file must remain there but empty, so that future rpm upgrades do not replace it.
To set up a basic federation, you just have to add some endpoints to /etc/ugr/conf.d/endpoints.conf (in fact you can name the file whatever you like provided that it ends with .conf).
###########
## Talk to a dCache instance in KIT
##
glb.locplugin[]: /usr/lib64/ugr/libugrlocplugin_dav.so KIT 5 https://f01-060-110-e.gridka.de:2880/pnfs/gridka.de/atlas
locplugin.KIT.ssl_check: false
locplugin.KIT.cli_type:PROXY
locplugin.KIT.cli_certificate: /tmp/proxy
locplugin.KIT.cli_private_key: /tmp/proxy
locplugin.KIT.xlatepfx: /atlasdisks /
Note the requirement for some kind of grid credentials, in this case a proxy, to allow dynafed to access the namespaces of the endpoints it is federating. This credential must be readable by httpd.
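Since every credential named in the endpoint configuration must be readable by httpd, it can help to list those paths before starting the service. The following is a minimal sketch and not part of Dynafed: the function name list_endpoint_credentials is hypothetical, and it relies only on the locplugin.<ID>.cli_certificate and locplugin.<ID>.cli_private_key keys shown in the example above.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Dynafed): print each distinct file
# named by a cli_certificate or cli_private_key key in an endpoints file.
list_endpoint_credentials() {
    awk '
        /^[ \t]*#/ { next }    # ignore comment lines
        /^[ \t]*locplugin\.[^.]+\.cli_(certificate|private_key)[ \t]*:/ {
            sub(/^[^:]*:[ \t]*/, "")    # strip the "key:" part
            sub(/[ \t]+$/, "")          # strip trailing blanks
            if ($0 != "" && !seen[$0]++) print
        }
    ' "$1"
}

# Usage sketch, assuming httpd runs as user "apache" (check your setup):
#   list_endpoint_credentials /etc/ugr/conf.d/endpoints.conf |
#   while read -r f; do
#       sudo -u apache test -r "$f" || echo "not readable by httpd: $f"
#   done
```

Paths are printed once each, so a proxy file used as both certificate and private key is checked a single time.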
Exotic: natively talking to a DPM or LFC database
If you wish your federation to talk directly to a DPM or LFC database, for browsing or name translation purposes, you can install dynafed-dmlite-plugin.
Apache modules configuration
The original dav modules of Apache must not be loaded at startup. To make sure they are not, double-check that the corresponding lines are commented out (or absent) in the file /etc/httpd/conf/httpd.conf, as in the following example:
LoadModule mime_module modules/mod_mime.so
#LoadModule dav_module modules/mod_dav.so
LoadModule status_module modules/mod_status.so
LoadModule autoindex_module modules/mod_autoindex.so
LoadModule info_module modules/mod_info.so
#LoadModule dav_fs_module modules/mod_dav_fs.so
LoadModule vhost_alias_module modules/mod_vhost_alias.so
Depending on the platform, the modules to comment out may instead be in a config file in the directory /etc/httpd/conf.modules.d.
REMINDER: if the Python authorization plugin is used, please make sure that mod_wsgi is also not loaded at Apache startup. Please refer to the full Dynafed white paper for details.
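The check for active dav modules described above can be scripted. This is a minimal sketch under stated assumptions: the function name find_active_dav_modules is hypothetical, and it looks only for the two modules named in the example (dav_module and dav_fs_module) in the files passed to it.

```shell
#!/bin/sh
# Hypothetical helper: print every active (uncommented) LoadModule line for
# mod_dav or mod_dav_fs in the given files. Exit status follows grep:
# 0 if at least one such line exists, 1 if none was found.
find_active_dav_modules() {
    grep -HnE '^[[:space:]]*LoadModule[[:space:]]+dav(_fs)?_module[[:space:]]' "$@"
}

# Usage sketch (paths as named in the text; the glob may match nothing):
#   if find_active_dav_modules /etc/httpd/conf/httpd.conf \
#          /etc/httpd/conf.modules.d/*.conf
#   then echo "comment out the lines listed above"
#   fi
```

Each match is printed as file:line:content, which points directly at the line to comment out.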
To start Dynafed, start httpd:
/etc/init.d/httpd start
Authentication and Authorisation
To require a client certificate
yum install mod_ssl gridsite
then uncomment the SSL section in zlcgdm-ugr-dav.conf. You will need a host certificate (see zlcgdm-ugr-dav.conf) for SSL to work.
You also may want to install the Certification Authorities for Grid usage:
yum install ca-policy-egi-core
This package comes from the repository defined in /etc/yum.repos.d/egi-trustanchors.repo:
[egi-trustanchors]
gpgcheck=0
protect=0
enabled=1
name=EGI-trustanchors
baseurl=http://repository.egi.eu/sw/production/cas/1/current/
If you want your frontend to recognize VOMS proxies, FQANs and WLCG VOs, you may want to install the VOMS configuration for your VOs, e.g. for ATLAS:
yum install wlcg-voms-atlas
These configuration files may come from the WLCG software repository.
Dynafed has several authorization mechanisms that are described in detail in the documentation. As an example, to authorise members of the ATLAS VO (for read and list in this case), add the following to ugr.conf:
glb.allowgroups[]: /atlas/* /fed/atlas rl
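To review which group rules are currently configured, the glb.allowgroups[] lines can be listed from the configuration file. The sketch below is illustrative only: the function name list_allowgroups is hypothetical, and labelling the three fields as group pattern, path and permission letters is an interpretation of the example above, not a statement of the full syntax (see the white paper for that).

```shell
#!/bin/sh
# Hypothetical helper: list the glb.allowgroups[] rules of a ugr.conf-style
# file, one per line, labelling the three whitespace-separated fields.
# The field labels are an assumption based on the example rule in the text.
list_allowgroups() {
    awk '
        /^[ \t]*glb\.allowgroups\[\][ \t]*:/ {
            sub(/^[^:]*:[ \t]*/, "")    # strip the "glb.allowgroups[]:" part
            printf "group=%s path=%s perms=%s\n", $1, $2, $3
        }
    ' "$1"
}

# Usage sketch:
#   list_allowgroups /etc/ugr/ugr.conf
```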
Validating your installation
Your Dynafed instance should function a lot like a standard HTTP server. You should be able to do the following
- View your federation via HTTP or HTTPS (according to configuration) in a browser, typically at https://host.org/myfed/
  - NB - in this case, your VOMS membership isn't available, so you may not be able to access the storage systems
- Examine the metalink files provided for each replica to check all expected replicas are there (click the metalink icon in the file listing)
- Access the endpoint using tools such as curl, davix or gfal2
  - curl -L --capath /etc/grid-security/certificates/ --cert /tmp/x509up <endpoint>
  - davix-ls -P grid <endpoint>
  - gfal-ls davs://host.org/myfed
  - Analogous commands for upload (if configured) can also be tried
- Check /var/log/ugr/ugr.log for evidence of any problems
A working demo federation is usually available for comparison at http://federation.desy.de/myfed/
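The curl check from the list above can be wrapped into a small smoke test. This is a sketch under stated assumptions: the function names http_status and federation_ok are hypothetical, the CA directory is the one used in the curl example, and treating any 2xx final status as success is a simplification.

```shell
#!/bin/sh
# Hypothetical helpers for a quick smoke test of a federation URL.

# Print the HTTP status code of the final response for URL $1, following
# redirects. If $2 is given, it is used as the client credential file.
http_status() {
    if [ -n "${2:-}" ]; then
        curl -s -L -o /dev/null -w '%{http_code}' \
            --capath /etc/grid-security/certificates/ --cert "$2" "$1"
    else
        curl -s -L -o /dev/null -w '%{http_code}' "$1"
    fi
}

# Succeed only if curl itself succeeded and the final status code is 2xx.
federation_ok() {
    code=$(http_status "$@") || return 1
    case $code in
        2??) return 0 ;;
        *)   return 1 ;;
    esac
}

# Usage sketch (URL and proxy path taken from the examples in the text):
#   federation_ok https://host.org/myfed/ /tmp/x509up || echo "check ugr.log"
```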
Support
Please use GGUS or contact dynafed-users-forum (cern.ch).
Preview Repositories
The latest version of Dynafed is in the EPEL repository.
To get release candidate or dev releases of dynafed, follow these pointers.
Candidate release
The packages that are related to the Dynamic Federations are available from the continuous builds by adding the LCGDM and LCGUTIL repositories to YUM.
EL6 LCGDM repo
[lcgdm-cbuilds-el6]
name=LCGDM Continuous Build Repository
baseurl=http://grid-deployment.web.cern.ch/grid-deployment/dms/lcgdm/repos/release-candidate/el6/$basearch
gpgcheck=0
enabled=1
protect=0
Continuous Build
The continuous builds are the latest builds available from trunk for the Dynamic Federations project. Install the repo files for lcgdm and lcgutil:
https://twiki.cern.ch/twiki/bin/view/DPM/DpmContinuousBuilds
Code
Link to the SVN trunk