Reference for CMS-Specific Modules

These modules are intended for use in CMS production. The sub-modules should stay relatively independent of one another, since each does something dramatically different.

checkphedex.py

A module that provides functions to check the comparison results against the list of files and deletions in PhEDEx.

Author: Daniel Abercrombie <dabercro@mit.edu>
dynamo_consistency.cms.checkphedex.check_datasets(site, orphan_list_file)[source]

Exhaustively checks PhEDEx for datasets that should exist at a site, according to PhEDEx, but have files marked as orphans by our check. The number of filereplicas for each dataset is printed to the terminal. Datasets that contain any filereplicas are returned by this function.

Parameters:
  • site (str) – The name of the site to check
  • orphan_list_file (list) – List of LFNs that are listed as orphans at the site
Returns:

A list of tuples, each holding the number of files and the name of a dataset that is supposed to have at least one file at the site.

Return type:

list of tuples
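
As a rough illustration of the return value described above, the sketch below turns a mapping of dataset names to filereplica counts into a list of tuples, keeping only datasets with at least one replica. The function name `datasets_with_replicas`, the input mapping, and the `(count, dataset)` tuple order are assumptions made for this example, not the module's actual implementation.

```python
def datasets_with_replicas(replica_counts):
    """Return (count, dataset) tuples for datasets with at least one replica.

    replica_counts is a hypothetical dict mapping dataset name to the number
    of filereplicas PhEDEx reports at the site.
    """
    # Keep only datasets that PhEDEx believes have files at the site
    found = [(count, dataset)
             for dataset, count in replica_counts.items()
             if count > 0]
    # Largest counts first, so the most affected datasets are easy to spot
    return sorted(found, reverse=True)
```

A caller could then loop over the result and cross-check each listed dataset by hand before deleting any of its orphan files.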

dynamo_consistency.cms.checkphedex.deletion_requests(site)[source]

Get a list of datasets with approved deletion requests at a given site that were created within the number of days set by the IgnoreAge configuration parameter. This request is done via the PhEDEx deleterequests API.

Parameters: site (str) – The site that we want the list of deletion requests for.
Returns: Datasets that are in deletion requests
Return type: set
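
The age cut described above can be sketched as follows. This is a minimal example under stated assumptions: the record layout (keys `created` and `datasets`) is hypothetical and is not the format returned by the PhEDEx deleterequests API, and `ignore_age_days` stands in for the IgnoreAge configuration value.

```python
import time


def recent_deletion_datasets(requests, ignore_age_days, now=None):
    """Collect datasets from requests created within the last ignore_age_days.

    requests is a hypothetical list of dicts with a 'created' epoch timestamp
    and a 'datasets' list; this layout is an assumption for illustration.
    """
    now = time.time() if now is None else now
    cutoff = now - ignore_age_days * 24 * 3600

    datasets = set()
    for request in requests:
        # Only requests newer than the cutoff contribute datasets
        if request['created'] >= cutoff:
            datasets.update(request['datasets'])
    return datasets
```
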
dynamo_consistency.cms.checkphedex.get_files(site, dataset)[source]

Get the list of file replicas at a site for a given dataset. This is done via the PhEDEx filereplicas API.

Parameters:
  • site (str) – The name of the site to check
  • dataset (str) – The name of the dataset to check
Returns:

A list of files at the site for a given dataset

Return type:

list
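
To make the shape of the result concrete, the sketch below flattens a decoded JSON response into a list of file names. The nested `phedex` / `block` / `file` / `name` layout is an assumption about the filereplicas response, and `files_from_response` is a hypothetical helper, not a function of this module.

```python
def files_from_response(response):
    """Flatten a decoded filereplicas-style response into a list of LFNs.

    The response layout (phedex -> block -> file -> name) is assumed here
    for illustration and should be checked against the real API output.
    """
    names = []
    for block in response.get('phedex', {}).get('block', []):
        for file_info in block.get('file', []):
            names.append(file_info['name'])
    return names
```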

filedumps.py

A module to handle file dumps from sites

class dynamo_consistency.cms.filedumps.LineReader[source]

A callable object that translates lines from a file dump. It tracks the time that it was initialized.
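
The general pattern of a callable translator that remembers when it was created can be sketched like this. The class name `SimpleLineReader` and the whitespace-separated `name size` line format are assumptions for the example; the real dump format and the real LineReader output are not specified here.

```python
import time


class SimpleLineReader(object):
    """Hypothetical callable that parses 'name size' lines from a dump."""

    def __init__(self):
        # Record when the reader was created, e.g. to compare with file ages
        self.start_time = time.time()

    def __call__(self, line):
        fields = line.split()
        if len(fields) < 2:
            return None  # Not enough fields to be a file entry
        try:
            return fields[0], int(fields[1])
        except ValueError:
            return None  # Size field was not an integer
```

Because instances are callable, one can be passed anywhere a plain translation function is expected while still carrying its start time as state.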

dynamo_consistency.cms.filedumps.read_ral_dump(endpoint, datestring=None)[source]

Copies file from remote site and lists

Parameters:
  • endpoint (str) – The SE to copy the file dump from
  • datestring (str) – An optional datestring to force source file name
Returns:

A tuple of the filename and translator

Return type:

tuple

filters.py

This module defines any filters that are used specifically for CMS.

class dynamo_consistency.cms.filters.DatasetFilter(datasets)[source]

Filter to check if files are in the CMS-style datasets

Parameters: datasets (set) – Set (or other collection) of datasets using CMS notation that we want to identify files as part of
protected(file_name)[source]

Returns whether the file is in a stored dataset. If the file name is not structured in a way that allows the dataset to be extracted, this function chooses to filter the file out.

Parameters: file_name (str) – Full LFN of file
Returns: If file belongs to a dataset that is stored
Return type: bool
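
The check described above can be sketched as follows. This is an illustration under assumptions: it supposes LFNs of the form `/store/<type>/<era>/<primary>/<tier>/<processing>/...` mapping to the dataset `/<primary>/<era>-<processing>/<tier>`, and it treats an unparseable name as protected, which is one reading of "filter it out". The names `lfn_to_dataset` and `is_protected` are hypothetical.

```python
def lfn_to_dataset(file_name):
    """Return the CMS-style dataset name for an LFN, or None if unparseable."""
    parts = file_name.split('/')
    # Expected: ['', 'store', type, era, primary, tier, processing, ..., file]
    if len(parts) < 8 or parts[1] != 'store':
        return None
    era, primary, tier, processing = parts[3:7]
    return '/%s/%s-%s/%s' % (primary, era, processing, tier)


def is_protected(file_name, datasets):
    """True if the file belongs to one of the datasets (or cannot be parsed)."""
    dataset = lfn_to_dataset(file_name)
    if dataset is None:
        # Assumption: oddly named files are protected rather than flagged
        return True
    return dataset in datasets
```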

unmerged.py

A module for handling the listing and cleaning of /store/unmerged

dynamo_consistency.cms.unmerged.clean_unmerged(site)[source]

Lists the /store/unmerged area of a site, and then uses cmstoolbox.unmergedcleaner.listdeletable to list files to delete and adds them to the registry.

Warning

This function has a number of side effects on various module configurations. Call it only after running the main site consistency check.

Parameters: site (str) – The site to run the check over
Returns: The number of files entered into the registry and the number that are log files
Return type: int, int
dynamo_consistency.cms.unmerged.report_contents(timestamp, site, files)[source]

Creates a SQLite3 database that contains all of the files and directories in a list. This database is then copied to the WebDir with the name SITE_unmerged.db.

Parameters:
  • timestamp (int) – Time that the listing was done
  • site (str) – Used mostly for naming the database
  • files (list) – List of files to put in the database
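
A minimal sketch of writing such a listing into SQLite3 is shown below. The table name, column names, and the `write_listing` helper are hypothetical; the real database schema and the copy to the WebDir are not reproduced here.

```python
import sqlite3


def write_listing(db_path, timestamp, files):
    """Store file names and the listing time in a SQLite3 database.

    The schema used here (table 'files' with 'name' and 'listed_at') is an
    assumption for illustration only.
    """
    conn = sqlite3.connect(db_path)
    with conn:  # Commits on success, rolls back on error
        conn.execute(
            'CREATE TABLE IF NOT EXISTS files '
            '(name TEXT PRIMARY KEY, listed_at INTEGER)')
        conn.executemany(
            'INSERT OR REPLACE INTO files VALUES (?, ?)',
            [(name, timestamp) for name in files])
    return conn
```

Passing `':memory:'` as `db_path` keeps the database in memory, which is convenient for trying the function out before writing a real `SITE_unmerged.db` file.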