Submitted by furano on

Some of the goals of the project are:

  • providing components that aggregate, on the fly, storage and metadata farms that expose standard protocols supporting redirections and WAN data access
  • making them transparently behave as a unique system, similar to a WAN-wide cluster
  • building, on the fly, the illusion of a unique namespace from a set of distinct storage endpoints or metadata endpoints such as DBs
  • read and write support
  • full S3 support, including the hierarchical organization of the files in buckets
  • acting as a transparent federating entry point for multiple S3/Azure storage buckets
  • a simple and effective VOMS-friendly authorization scheme, compatible with 'normal' Apache authentication plugins (e.g. username/pwd)
  • authorization rules that are expandable through customizable Python functions
  • a core that can also accommodate explicit, catalogue-based indexing and name translation
  • very high compatibility with the external data management systems it interacts with
  • very high flexibility in naming conventions, with algorithmic and non-algorithmic translations
  • very high performance and scalability, even with slow/distant endpoints or external databases.

The most notable examples of suitable protocols are HTTP/WebDAV, S3 and Azure.
The system is generic enough to support others, if a suitable frontend exists.

Documentation

The dynafed documentation is in white paper format:

https://gitlab.cern.ch/lcgdm/dynafed/raw/develop/doc/whitepaper/Doc_DynaFeds.pdf?inline=false


Important: System tuning

A Dynafed instance that is intended for production use *must* run on a properly configured machine.

The default operating system parameters allow neither high scalability nor high performance for services like Dynafed.
Please refer to the system tuning parameters (file descriptors, processes, Apache HTTPD) that are suggested for the DPM configuration:

https://twiki.cern.ch/twiki/bin/view/DPM/DpmSetupTuningHints


Installation

Dynafed is available through the EPEL repository.

yum install dynafed dynafed-http-plugin dynafed-dmlite-frontend memcached

Ensure the services come back after a reboot:

/sbin/chkconfig httpd on
/sbin/chkconfig memcached on
/sbin/chkconfig php-fpm on

SELinux may prevent httpd from accessing the necessary files, or from making the outbound connections to the federated endpoints. You can either adjust the SELinux configuration appropriately (e.g. setsebool -P httpd_can_network_connect on), or turn it off by editing /etc/sysconfig/selinux and rebooting (or echo 0 > /selinux/enforce).

Similarly, you may have to adjust the local firewall (typically iptables) to allow traffic in (from clients) and out (federation requests to the endpoints).
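On EL6 the firewall is typically driven by /etc/sysconfig/iptables. A minimal sketch for an instance serving plain HTTP and HTTPS would add rules like the following before the final REJECT rule (the ports are examples; adjust them to your Apache vhost configuration — outbound traffic is usually allowed by the default policy):

```
# Allow inbound HTTP/HTTPS to the Dynafed frontend (example ports)
-A INPUT -m state --state NEW -m tcp -p tcp --dport 80 -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 443 -j ACCEPT
```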

Give memcached sensible parameters in /etc/sysconfig/memcached:

PORT="11211"
USER="memcached"
MAXCONN="16384"
CACHESIZE="1024"
OPTIONS=""
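Dynafed can then use this memcached instance as a shared cache for metadata lookups, configured in ugr.conf. The directive names below are assumptions drawn from the defaults shipped with the package; verify them against your installed /etc/ugr/ugr.conf and the white paper:

```
# Use memcached as an external metadata cache (verify directive names
# against the shipped /etc/ugr/ugr.conf)
glb.infohandler.useextcache: true
extcache.memcached.server[]: localhost/11211
extcache.memcached.ttl: 600
```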

The principal configuration files are:

/etc/ugr/ugr.conf
/etc/ugr/conf.d/endpoints.conf
/etc/httpd/conf.d/zlcgdm-ugr-dav.conf

Before starting, completely wipe the content of the file /etc/httpd/conf.d/zlcgdm-dav.conf if it's present in the directory.

cat /dev/null > /etc/httpd/conf.d/zlcgdm-dav.conf

NB - the file must remain there but empty, so that future rpm upgrades do not replace it.

To set up a basic federation, you just have to add some endpoints to /etc/ugr/conf.d/endpoints.conf (in fact you can name the file whatever you like provided that it ends with .conf).

###########
## Talk to a dCache instance in KIT
##
glb.locplugin[]: /usr/lib64/ugr/libugrlocplugin_dav.so KIT 5 https://f01-060-110-e.gridka.de:2880/pnfs/gridka.de/atlas
locplugin.KIT.ssl_check: false
locplugin.KIT.cli_type:PROXY
locplugin.KIT.cli_certificate: /tmp/proxy
locplugin.KIT.cli_private_key: /tmp/proxy
locplugin.KIT.xlatepfx: /atlasdisks /

Note the requirement for some kind of grid credentials, in this case a proxy, to allow dynafed to access the namespaces of the endpoints it is federating. This credential must be readable by httpd.
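Additional endpoints of other kinds follow the same pattern. As a sketch, an S3 bucket could be federated with the S3 location plugin; the plugin path follows the same convention as the WebDAV one above, but the endpoint URL, credentials and key directive names here are placeholders/assumptions — check the white paper for the exact directives:

```
###########
## Hypothetical S3 endpoint (URL and credentials are placeholders)
##
glb.locplugin[]: /usr/lib64/ugr/libugrlocplugin_s3.so myS3 5 https://s3.example.com/mybucket
locplugin.myS3.s3.pub_key: ACCESSKEY
locplugin.myS3.s3.priv_key: SECRETKEY
locplugin.myS3.xlatepfx: /s3 /
```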

Exotic: natively talking to a DPM or LFC database

If you wish your federation to talk directly to a DPM or LFC database, for browsing or name translation purposes, you can install dynafed-dmlite-plugin.

Apache modules configuration

The original DAV modules of Apache must not be loaded at startup. To make sure they are not, double-check that the corresponding lines are commented out (or absent) in the file /etc/httpd/conf/httpd.conf, as in the following example:

LoadModule mime_module modules/mod_mime.so
#LoadModule dav_module modules/mod_dav.so
LoadModule status_module modules/mod_status.so
LoadModule autoindex_module modules/mod_autoindex.so
LoadModule info_module modules/mod_info.so
#LoadModule dav_fs_module modules/mod_dav_fs.so
LoadModule vhost_alias_module modules/mod_vhost_alias.so

Depending on the platform flavour, the modules to comment out may instead be in a config file in the directory /etc/httpd/conf.modules.d.

REMINDER: if the Python authorization plugin is used, please make sure that mod_wsgi is also removed from the Apache initialization. Please refer to the full Dynafed white paper for details.

To start Dynafed, start httpd:

/etc/init.d/httpd start

Authentication and Authorisation

To require a client certificate:

yum install mod_ssl gridsite

then uncomment the SSL section in zlcgdm-ugr-dav.conf. You will need a host certificate (see zlcgdm-ugr-dav.conf) for SSL to work.

You may also want to install the Certification Authorities for Grid usage:

yum install ca-policy-egi-core

which comes from the repository defined in /etc/yum.repos.d/egi-trustanchors.repo:

[egi-trustanchors]
gpgcheck=0
protect=0
enabled=1
name=EGI-trustanchors
baseurl=http://repository.egi.eu/sw/production/cas/1/current/

If you plan for your frontend to recognize VOMS proxies, FQANs and WLCG VOs, you may want to install the VOMS configurations for your favourite VOs; e.g. for ATLAS this will be:

yum install wlcg-voms-atlas

These configuration files may come from the WLCG software repository.

Dynafed has several authorization mechanisms that are described in detail in the documentation. As an example, to authorise members of the ATLAS VO (for read and list in this case), add the following to ugr.conf:

glb.allowgroups[]: /atlas/* /fed/atlas rl
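Rules like the one above can also be complemented with a custom Python authorization function, as mentioned in the project goals. The callback name, parameter list and return convention below are illustrative assumptions — the real interface is defined in the white paper and in the example plugin shipped with the package — but a sketch equivalent to the allowgroups rule above might look like:

```python
# Hypothetical sketch of a Dynafed Python authorization callback.
# The real callback name, signature and return convention are defined in
# the Dynafed white paper and the shipped example plugin; adapt accordingly.

def isallowed(clientname, remoteaddr, fqans, keys, mode, path):
    """Return True if the client may perform `mode` ('r', 'w', 'l', ...)
    on `path`. All parameter names here are illustrative."""
    # Allow any client with an ATLAS FQAN to read and list under /fed/atlas,
    # mirroring: glb.allowgroups[]: /atlas/* /fed/atlas rl
    if path.startswith("/fed/atlas") and mode in ("r", "l"):
        return any(f.startswith("/atlas") for f in fqans)
    # Deny everything else by default
    return False
```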

Validating your installation

Your Dynafed instance should behave much like a standard HTTP server. You should be able to do the following:

  • View your federation via HTTP or HTTPS (according to configuration) in a browser, typically at https://host.org/myfed/
    • NB - in this case, your VOMS membership isn't available, so you may not be able to access the storage systems
  • Examine the metalink files provided for each replica to check all expected replicas are there (click the metalink icon in the file listing)
  • Access the endpoint using tools such as curl, davix or gfal2
    • curl -L --capath /etc/grid-security/certificates/ --cert /tmp/x509up <endpoint>
    • davix-ls -P grid <endpoint>
    • gfal-ls davs://host.org/myfed
    • Analogous commands for upload (if configured) can also be tried
  • Check /var/log/ugr/ugr.log for evidence of any problems

A working demo federation is usually available for comparison at http://federation.desy.de/myfed/

Support

Please use GGUS or contact dynafed-users-forum (cern.ch).

Preview Repositories

The latest version of dynafed is in the EPEL repository.

To get release candidate or dev releases of dynafed, follow these pointers.

Candidate release

The packages related to the Dynamic Federations are available from the continuous builds by adding the LCGDM and LCGUTIL repositories to YUM.

EL6 LCGDM repo

[lcgdm-cbuilds-el6]
name=LCGDM Continuous Build Repository
baseurl=http://grid-deployment.web.cern.ch/grid-deployment/dms/lcgdm/repos/release-candidate/el6/$basearch
gpgcheck=0
enabled=1
protect=0


Continuous Build

The continuous builds are the latest builds available from trunk for the Dynamic Federations project. Install the repo files for lcgdm and lcgutil:

https://twiki.cern.ch/twiki/bin/view/DPM/DpmContinuousBuilds

Code

Link to the source repository:

https://gitlab.cern.ch/lcgdm/dynafed
