Running Over CMS Sites

Configuration

Create a configuration file before pointing to it, as shown above. The configuration file for Site Consistency is a JSON or YAML file with the following keys:

  • AccessMethod - A dictionary of access methods for sites. Sites default to XRootD, but setting a value of SRM causes the site to be listed by gfal-ls commands.

  • AdditionalLogDeletions - A dictionary that lists which directories have logs to be cleaned for different sites. These log directories are treated the same as log directories in /store/unmerged. This means they use the UnmergedLogsAge parameter to determine cleaning policy.

  • DirectoryList - A list of directories inside of RootPath to check consistency.

  • DeleteOrphans - Defaults to true. If set to false, all orphan files are filtered out so that none are deleted.

  • FreeMem - The amount of free memory that is required before dynamo-consistency tries to run a check. The memory is given in GBs.

  • GFALThreads - The number of threads used by the GFAL listers

  • GlobalRedirectors - The redirectors to start all locate calls from, unless looking for a site that is listed in the Redirectors configuration.

  • IgnoreAge - Ignore any files or directories with an age less than this, in days.

  • IgnoreDirectories - The check ignores any paths that contain any of the strings in this list.

  • InventoryAge - The maximum allowed age, in days, of the information from the inventory

  • ListAge - The maximum allowed age, in days, of the list of files taken directly from the site

  • ListDeletable - Configuration for unmerged cleaning “listdeletable” module. Details on some of the configuration parameters are here.

  • MaxMissing - If more files than this number are missing, then there will be no automatic entry into the register.

  • MaxOrphan - If more files than this number are orphan files at a site, then there will be no automatic entry into the register.

  • NumThreads - The number of threads used by the XRootD listers

  • PathPrefix - A dictionary of prefixes to place before RootPath in the XRootD call. If the prefix is not set for a site, and it fails to list RootPath, it uses the default /cms.

  • RedirectorAge - The maximum allowed age, in days, of the information on doors from redirectors. If this value is set to zero, the redirector information is never refreshed.

  • Redirectors - A dictionary with keys of sites with hard-coded redirector locations. If a site is not listed in this way, the redirector is found by matching domains from CMSToolBox.siteinfo.get_domain() to redirectors found in a generic xrdfs locate call.

  • Retries - Number of retries after timeouts to attempt

  • RootPath - The directory that contains all of the listed subdirectories. For CMS sites, this will be "/store"

  • SaveCache - If set and evaluates to True, copies old cached directory trees instead of overwriting

  • Timeout - This gives the amount of time, in seconds, that you want the listing to try to run on a single directory before it times out.

  • Unmerged - A list of sites to handle cleaning of /store/unmerged on. If the list is empty, all the sites are managed centrally

  • UnmergedLogsAge - The minimum age of the unmerged logs to be deleted, in days

  • UseLoadBalancer - A list of sites where the main redirector of the site is used

  • UseTransferQueue - If true, puts missing files into the transfer queue table when using --v1 for reporting. Defaults to true.

  • VarLocation - The location for the varying directory. Inside this directory will be:

    - Logs
    - Redirector lists
    - Cached trees
    - Lock files
    
  • WebDir - The directory where text files and the sqlite3 database live
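
As an illustration of how IgnoreAge and IgnoreDirectories interact, the following is a minimal sketch and not the package's own code. The function name should_ignore and its arguments are hypothetical. It assumes that a path's age is computed from a modification time in seconds since the epoch.

```python
import time

SECONDS_PER_DAY = 24 * 60 * 60


def should_ignore(path, mtime, ignore_directories, ignore_age_days, now=None):
    """Return True if a path would be skipped under the two settings.

    Hypothetical helper: the real check may be structured differently.
    """
    now = time.time() if now is None else now
    # IgnoreDirectories: skip any path that contains one of the listed strings
    if any(pattern in path for pattern in ignore_directories):
        return True
    # IgnoreAge: skip anything younger than the given number of days
    age_days = (now - mtime) / SECONDS_PER_DAY
    return age_days < ignore_age_days
```

With IgnoreAge set to 1, a file modified an hour ago is skipped, while a two-day-old file outside the ignored directories is checked.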

Configuration parameters can also be quickly overwritten for a given run by setting an environment variable of the same name.
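
The override behavior can be pictured with the sketch below. It is an assumption-laden illustration, not the package's implementation: get_setting is a hypothetical name, and parsing the environment value as JSON (falling back to a plain string) is a guess at reasonable behavior.

```python
import json
import os


def get_setting(name, config, environ=None):
    """Look up a setting, letting an environment variable of the same name win.

    Hypothetical helper; 'config' is the dictionary loaded from the
    configuration file.
    """
    environ = os.environ if environ is None else environ
    if name in environ:
        raw = environ[name]
        try:
            # Assumption: values such as "4" or "true" are interpreted as JSON
            return json.loads(raw)
        except ValueError:
            # Anything that is not valid JSON is kept as a plain string
            return raw
    return config[name]
```

Under these assumptions, running with NumThreads=2 in the environment would take precedence over the value in the file for that run only.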

Production Settings

The configuration in production is copied to the summary website whenever it changes, so that is the best place to see the production settings. Navigate to the relative location consistency_config.json. For the current dynamo production server, that would be at http://dynamo.mit.edu/consistency/consistency_config.json

Comparison Script

Note

The following script description was last updated on April 11, 2018.

The production script, located at dynamo_consistency/prod/compare.py at the time of writing, goes through the following steps for each site.

  1. Points config.py to the local consistency_config.json file
  2. Notes the time, and whether daylight saving time is in effect, for entry into the summary database
  3. Reads the list of previous missing files, since it requires a file to be missing on multiple runs before registering it to be copied
  4. Gathers the inventory tree by calling dynamo_consistency.getinventorycontents.get_db_listing().
  5. Creates a list of datasets to not report missing files in. This list consists of the following.
    • Deletion requests fetched from PhEDEx by dynamo_consistency.checkphedex.set_of_deletions()
  6. Creates a list of datasets to not report orphans in. This list consists of the following.
    • Datasets that have any files on the site, as listed by the dynamo MySQL database
    • Deletion requests fetched from PhEDEx (same list as datasets to skip in missing)
    • Any datasets that have the status flag set to 'IGNORED' in the dynamo database
    • Merging datasets that are protected by Unified
  7. Gathers the site tree by calling dynamo_consistency.getsitecontents.get_site_tree(). The list of orphans is used while the listing runs to filter out empty directories that are reported to the registry during the run.
  8. Does the comparison between the two trees made, using the configuration options listed under Configuration concerning file age.
  9. If the number of missing files is less than MaxMissing, the number of orphans is less than MaxOrphan, and the site is under the webpage’s “Debugged sites” tab, connects to a dynamo registry to report the following errors:
    • Every orphan file and every empty directory that is not too new nor should contain missing files is entered in the deletion queue.
    • For each missing file, every possible source site listed by the dynamo database (not counting the site where the file is missing) is entered in the transfer queue. Creates a text file listing the files that only exist elsewhere on tape.
  10. Creates a text file that contains the missing blocks and groups.
  11. .txt file lists and details of orphan and missing files are moved to the web space
  12. If the site is listed in the configuration under the Unmerged list, the unmerged cleaner is run over the site:
    • dynamo_consistency.getsitecontents.get_site_tree() is run again, this time only over /store/unmerged
    • Empty directories that are not too new nor protected by Unified are entered into the deletion queue
    • The list of files is passed through the Unmerged Cleaner
    • The list of files to delete from Unmerged Cleaner is entered in the deletion queue
  13. The summary database is updated to show the last update on the website
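
Two of the decisions in the steps above, steps 3 and 9, can be sketched as follows. This is an illustration only and not the code in compare.py. The function names are hypothetical, "multiple runs" is simplified to two consecutive runs, and the comparisons use "less than" as worded in step 9.

```python
def confirmed_missing(previous_missing, current_missing):
    """Keep only files that were also missing in the previous run (step 3).

    Assumption: being missing in two consecutive runs is enough to confirm.
    """
    previous = set(previous_missing)
    return [name for name in current_missing if name in previous]


def should_report(num_missing, num_orphan, max_missing, max_orphan, is_debugged):
    """Mirror the three conditions of step 9 before contacting the registry."""
    return (num_missing < max_missing
            and num_orphan < max_orphan
            and is_debugged)
```

For example, a file that shows up as missing for the first time is not yet queued for transfer, and a site that is not under the "Debugged sites" tab gets no registry entries regardless of its counts.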

Automatic Site Selection

To automatically run prod/compare.py over a few well-deserving sites, use prod/run_checks.sh.

Manually Setting XRootD Doors

In addition to the Redirectors key in the configuration file, which sets the redirector for a site, there is also a mechanism for setting all the doors for a site. A list of possible doors can be found at <VarLocation>/redirectors/<SiteName>_redirector_list.txt. Any url in that list that matches the domain of the site will be used to make xrootd calls. To add or remove urls from this list, just add or remove lines from this file.
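
The domain matching described above might look like the sketch below. It is not taken from the package. The function name matching_doors is hypothetical, and it assumes one "host:port" entry per line and that a door matches when its host equals the site domain or is a subdomain of it.

```python
def matching_doors(lines, site_domain):
    """Return the door URLs whose host falls under site_domain.

    Hypothetical helper; 'lines' would be the lines of the site's
    redirector list file.
    """
    doors = []
    for line in lines:
        url = line.strip()
        if not url:
            continue  # skip blank lines
        host = url.rsplit(':', 1)[0]  # drop a trailing ":port" if present
        if host == site_domain or host.endswith('.' + site_domain):
            doors.append(url)
    return doors
```

Under these assumptions, removing a line from the file removes that door from the result, and lines for other domains are ignored.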

Note

If the RedirectorAge configuration parameter is not set to 0, then this redirector list will be overwritten once it becomes too old. To force the generation of a new list when the RedirectorAge is set to 0, simply delete the redirector list file for that site.

A list of redirectors found by the global redirectors is stored in <VarLocation>/redirectors/redirector_list.txt.