Reference

This is a full reference for the submodules of the dynamo_consistency module.

config.py

Small module to get information from the config.

author:Daniel Abercrombie <dabercro@mit.edu>
dynamo_consistency.config.DIRECTORYLIST = None

If this is set to a list of directories, it overrides the DirectoryList set in the configuration file. This prevents the tool from attempting to list directories that are not there.

dynamo_consistency.config.LOADER = <module 'json' from '/usr/lib/python2.7/json/__init__.pyc'>

A module whose load function takes a file descriptor and returns a dictionary. (Examples are the json and yaml modules.) If your LOCATION is not a JSON file, you’ll also want to change this before calling config_dict().

dynamo_consistency.config.LOCATION = 'consistency_config.json'

The string giving the location of the configuration JSON file. Generally, you want to set this module attribute before calling config_dict() to get your configuration.
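
The loader contract described above can be sketched as follows. This is a minimal illustration, not the module's own code: the helper name read_config and the throwaway configuration file are assumptions made for the example. Any object with a load function that accepts an open file and returns a dictionary could be passed in place of json.

```python
import json
import os
import tempfile


def read_config(location, loader=json):
    """Hypothetical helper: open `location` and parse it with `loader.load`."""
    with open(location, 'r') as config_file:
        return loader.load(config_file)


# Write a throwaway JSON configuration so the example runs anywhere.
with tempfile.NamedTemporaryFile('w', suffix='.json', delete=False) as tmp:
    json.dump({'DirectoryList': ['data', 'mc']}, tmp)

settings = read_config(tmp.name)
os.remove(tmp.name)
print(settings['DirectoryList'])
```

With the real module, the corresponding step is to assign config.LOCATION (and config.LOADER for a non-JSON file) before the first call to config_dict().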

dynamo_consistency.config.SITE = None

A global place that stores a site that has been picked. Set in dynamo_consistency.picker.pick_site().

dynamo_consistency.config.config_dict()

Loads the configuration file only the first time it is called

Returns:the configuration file in a dictionary
Return type:dict
Raises:IOError – when it cannot find the configuration file
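
A load-once function like this is commonly built around a module-level cache. The sketch below shows that pattern under stated assumptions: the names _CACHE, LOAD_COUNT and cached_config are hypothetical, the configuration value is invented, and the text is parsed from a string rather than a file so the example stays self-contained.

```python
import json

_CACHE = {}       # hypothetical module-level cache
LOAD_COUNT = 0    # counts how often the text is parsed, for illustration only


def cached_config(raw_text='{"VarLocation": "/tmp/var"}'):
    """Parse the configuration once, then return the cached dictionary."""
    global LOAD_COUNT
    if 'config' not in _CACHE:
        LOAD_COUNT += 1
        _CACHE['config'] = json.loads(raw_text)
    return _CACHE['config']


first = cached_config()
second = cached_config()
print(first is second, LOAD_COUNT)
```

Under this pattern, changing the source after the first call has no effect, which is consistent with the advice to set LOCATION before calling config_dict().
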
dynamo_consistency.config.vardir(directory)

Gets the full path to a sub-directory inside of VarLocation and creates an empty directory if needed.

Parameters:directory (str) – A desired sub-directory
Returns:Path to configured sub-directory
Return type:str
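
The behavior described here, joining a base location with a sub-directory and creating it when absent, can be sketched with the standard library. The function name vardir_sketch and the explicit var_location argument are assumptions for illustration; the real function reads VarLocation from the configuration.

```python
import os
import tempfile


def vardir_sketch(var_location, directory):
    """Return var_location/directory, creating the directory when missing."""
    path = os.path.join(var_location, directory)
    if not os.path.exists(path):
        os.makedirs(path)
    return path


base = tempfile.mkdtemp()              # stands in for the configured VarLocation
cache_dir = vardir_sketch(base, 'cache')
print(os.path.isdir(cache_dir))
```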

datatypes.py

This module defines the datatypes that are used for storage and comparison. It also provides a create_dirinfo function that takes a filler function or object and uses the multiprocessing module to recursively list directories in parallel.

author:Daniel Abercrombie <dabercro@mit.edu>
exception dynamo_consistency.datatypes.BadPath

An exception for throwing when the path doesn’t make sense for various methods of a DirectoryInfo

class dynamo_consistency.datatypes.DirectoryInfo(name='', directories=None, files=None)

Stores all of the information of the contents of a directory

Parameters:
  • name (str) – The name of the directory
  • directories (list) – If this is set, the infos in the list are merged into a master DirectoryInfo.
  • files (list) – List of tuples containing information about files in the directory.
add_file_list(file_infos)

Add a list of tuples containing file_name, file_size to the node. This is most useful when you get a list of files from some other source and want to easily convert that list into a DirectoryInfo()

Parameters:file_infos (list) – The list of files (full path, size in bytes[, timestamp])
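
To illustrate the idea of turning a flat file list into a directory tree, here is a minimal sketch that uses nested dictionaries as a stand-in for DirectoryInfo. The function build_tree, the dictionary layout, and the sample paths are all hypothetical and do not reflect the class's internal storage.

```python
def build_tree(file_infos):
    """Group (full path, size[, timestamp]) tuples into nested dictionaries."""
    root = {'dirs': {}, 'files': []}
    for info in file_infos:
        full_path, size = info[0], info[1]
        parts = full_path.strip('/').split('/')
        node = root
        # Walk (and create) one level per directory component.
        for directory in parts[:-1]:
            node = node['dirs'].setdefault(
                directory, {'dirs': {}, 'files': []})
        node['files'].append((parts[-1], size))
    return root


tree = build_tree([
    ('/store/data/a.root', 100),
    ('/store/data/b.root', 250),
    ('/store/mc/c.root', 75, 1500000000),   # optional timestamp, ignored here
])
print(sorted(tree['dirs']['store']['dirs']))
```
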
add_files(files)

Set the files for this DirectoryInfo node

Parameters:files (list) – The tuples of file information. Each element consists of file name, size, and mod time.
Returns:self for chaining calls
Return type:DirectoryInfo
compare(other, path='', check=None)

Does a one-way comparison with a different tree

Parameters:
  • other (DirectoryInfo) – The directory tree to compare this one to
  • path (str) – The path leading to this location so far
  • check (function) – An optional function that double checks a file name. If the checking function returns True for a file name, the file will not be included in the output.
Returns:

Tuple of the list of files and the list of directories that are present in this tree but not in the other tree, and the total size of those files

Return type:

list, list, long
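
A one-way comparison of this kind can be sketched on plain nested dictionaries, where a dictionary value is a sub-directory and an integer value is a file size. The function one_way_compare and the sample trees are hypothetical, and the choice to still descend into a directory that is absent from the other tree is an assumption of this sketch, not a documented behavior.

```python
def one_way_compare(this, other, path='', check=None):
    """Return (files, directories, size) present in `this` but not `other`."""
    files, directories, size = [], [], 0
    for name in sorted(this):
        entry = this[name]
        full = '%s/%s' % (path, name)
        if isinstance(entry, dict):                  # a sub-directory
            match = other.get(name)
            if not isinstance(match, dict):
                directories.append(full)
                match = {}                           # compare against nothing
            sub = one_way_compare(entry, match, full, check)
            files += sub[0]
            directories += sub[1]
            size += sub[2]
        elif name not in other and not (check and check(full)):
            files.append(full)                       # check() returning True skips it
            size += entry
    return files, directories, size


site = {'store': {'data': {'a.root': 100, 'b.root': 250},
                  'mc': {'c.root': 75}}}
inventory = {'store': {'data': {'a.root': 100}}}

result = one_way_compare(site, inventory)
filtered = one_way_compare(site, inventory,
                           check=lambda lfn: lfn.endswith('c.root'))
print(result)
```

Swapping the two arguments gives the comparison in the other direction, which is why the method is described as one-way.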

count_nodes(empty=False)
Parameters:empty (bool) – If True, only return the number of empty nodes
Returns:The total number of nodes in this DirectoryInfo. This corresponds to approximately the number of listing requests required to build the data.
Return type:int
display(path='')

Print out the contents of this DirectoryInfo

Parameters:path (str) – The full path to this DirectoryInfo instance
displays(path='')

Get the string to print out the contents of this DirectoryInfo.

Parameters:path (str) – The full path to this DirectoryInfo instance
Returns:The display string
Return type:str
empty_nodes_list()

This function should be used to get the nodes to delete in the proper order for non-recursive deletion

Returns:The list of empty directories to delete in the order to delete
Return type:list
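
Non-recursive deletion only works if a directory is removed after everything inside it. One way to obtain such an order, shown here as a sketch rather than the method's actual algorithm, is to sort the empty directories so that the deepest paths come first. The function name and sample paths are hypothetical.

```python
def deletion_order(empty_dirs):
    """Sort directory paths so that the deepest ones come first."""
    return sorted(empty_dirs, key=lambda path: (-path.count('/'), path))


order = deletion_order({'/store/mc', '/store/mc/old', '/store/mc/old/run1'})
print(order)
```
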
empty_nodes_set()

This function recursively builds the entire list of empty directories that can be deleted

Returns:The set of empty directories to delete
Return type:set
get_directory_size()

Report the total size used by this directory and its subdirectories.

Returns:Size of files in directory, in bytes
Return type:int
get_file(file_name)

Get the file dictionary based off the name.

Parameters:file_name (str) – The LFN of the file
Returns:Dictionary of file information
Return type:dict
Raises:BadPath – if the file_name does not start with self.name
get_files(min_age=0, path='')

Get the list of files that are older than some age

Parameters:
  • min_age (int) – The minimum age, in seconds, of files to list
  • path (str) – The path to this file. Used for recursive calls
Returns:

List of full file paths

Return type:

list
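
Filtering by age amounts to comparing each file's modification time with the current time. The sketch below assumes files are (name, size, modification time) tuples, as described for add_files; the function older_than, its now argument, and the inclusive comparison are assumptions made for illustration.

```python
import time


def older_than(files, min_age=0, now=None):
    """Return names of files whose age in seconds is at least `min_age`."""
    now = time.time() if now is None else now
    return [name for name, _size, mtime in files if now - mtime >= min_age]


sample = [('/store/a.root', 100, 1000), ('/store/b.root', 50, 9000)]
old_files = older_than(sample, min_age=3600, now=10000)
print(old_files)
```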

get_node(path, make_new=True)

Get the node that corresponds to the path given. If the node does not exist yet, and make_new is True, the node is created.

Parameters:
  • path (str) – Path to the desired node from current node. If the path does not exist yet, empty nodes will be created.
  • make_new (bool) – Whether to create a new node if none exists at path
Returns:

A node with the proper path, unless make_new is False and the node doesn’t exist

Return type:

DirectoryInfo or None

get_num_files(unlisted=False, place_new=False)

Report the total number of files stored.

Parameters:
  • unlisted (bool) – If true, return the number of unlisted directories. Otherwise, return only the number of successfully listed files
  • place_new (bool) – If true, pretend there’s one more file inside any new directory or if files is None. This prevents the list of empty directories from including directories that should not actually be deleted.
Returns:

The number of files in the directory tree structure

Return type:

int

get_unlisted(path='')
Parameters:path (str) – Path to prepend to the name, used in recursive calls
Returns:List of directories that were unlisted
Return type:list
listdir(*args, **kwargs)

Get the list of directory names within a DirectoryInfo. Adding an argument will display the contents of the next directory. For example, if dir.listdir() returns:

0: data
1: mc

dir.listdir(1) then lists the contents of mc and dir.listdir(1, 0) lists the contents of the first subdirectory in mc.

Parameters:
  • args – A list of indices selecting the subdirectories to descend into
  • kwargs – Supports ‘printing’, a bool that defaults to True.
Returns:

The DirectoryInfo that is being listed

Return type:

DirectoryInfo
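
The index-based navigation can be pictured with a small sketch on nested dictionaries. The function list_by_index is hypothetical, it returns names instead of a DirectoryInfo, and ordering the entries alphabetically is an assumption of this example.

```python
def list_by_index(tree, *indices):
    """Follow one index per level, then return the names found there."""
    node = tree
    for index in indices:
        node = node[sorted(node)[index]]
    return sorted(node)


tree = {'data': {'a': {}, 'b': {}},
        'mc': {'x': {'deep': {}}, 'y': {}}}

print(list_by_index(tree))         # top level
print(list_by_index(tree, 1))      # contents of mc
print(list_by_index(tree, 1, 0))   # contents of the first subdirectory in mc
```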

remove_node(path_name)

Remove an empty node from the DirectoryInfo

Parameters:

path_name (str) – The path to the node, including the self.name at the beginning

Returns:

self for chaining

Return type:

DirectoryInfo

Raises:
  • NotEmpty – if the directory is not empty or self.files is None
  • BadPath – if the path_name does not start with the self.name
save(file_name)

Save this DirectoryInfo in a file.

Parameters:file_name (str) – is the location to save the file
setup_hash()

Set the hashes for this DirectoryInfo

exception dynamo_consistency.datatypes.NotEmpty

An exception for throwing when a non-empty directory is deleted from a DirectoryInfo

dynamo_consistency.datatypes.compare(inventory, listing, output_base=None, orphan_check=None, missing_check=None)

Compare two different trees and output the differences into an ASCII file

Parameters:
  • inventory (DirectoryInfo) – The tree of files that should be at a site
  • listing (DirectoryInfo) – The tree of files that are listed remotely
  • output_base (str) – The base from which the names of the ASCII report files are generated.
  • orphan_check (function) – A function that double checks each expected orphan. The function takes an LFN as input. If the function returns true, the LFN will not be listed as an orphan.
  • missing_check (function) – A function that checks each expected missing file. The function takes an LFN as input. If the function returns true, the LFN will not be listed as missing.
Returns:

The two lists, missing and orphan files

Return type:

tuple
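
The roles of the two check functions can be sketched with plain sets of LFNs: files in the inventory but not in the listing are missing, files in the listing but not in the inventory are orphans, and each check can veto an entry. The function compare_sets and the sample LFNs are hypothetical; the real function operates on DirectoryInfo trees and can also write report files.

```python
def compare_sets(inventory, listing, orphan_check=None, missing_check=None):
    """Return (missing, orphan) lists, skipping LFNs that a check approves."""
    missing = sorted(lfn for lfn in inventory - listing
                     if not (missing_check and missing_check(lfn)))
    orphan = sorted(lfn for lfn in listing - inventory
                    if not (orphan_check and orphan_check(lfn)))
    return missing, orphan


expected = {'/store/data/a.root', '/store/data/b.root'}
found = {'/store/data/a.root', '/store/tmp/x.root', '/store/user/y.root'}

missing, orphan = compare_sets(
    expected, found,
    orphan_check=lambda lfn: lfn.startswith('/store/tmp/'))
print(missing, orphan)
```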

dynamo_consistency.datatypes.get_info(file_name)

Get the DirectoryInfo from a file.

Parameters:file_name (str) – is the location of the saved information
Returns:Saved info
Return type:DirectoryInfo

remotelister.py

Tool to get the files located at a site.

author:

Daniel Abercrombie <dabercro@mit.edu>

Max Goncharov <maxi@mit.edu>

dynamo_consistency.remotelister.listing(site, callback=None, **kwargs)

Get the information for a site, from XRootD or a cache.

Parameters:
  • site (str) – The site name
  • callback (function) – The callback function to pass to create.create_dirinfo()
Returns:

The site directory listing information

Return type:

dynamo_consistency.datatypes.DirectoryInfo

inventorylister.py

This module gets the information from the inventory about a site’s contents

author:Daniel Abercrombie <dabercro@mit.edu>
dynamo_consistency.inventorylister.filter_files(site, pathstrip)

Gets the files from the inventory and filters them through the configuration’s DirectoryList

Parameters:
  • site (str) – The site to get the files from
  • pathstrip (int) – The length of the root node’s name that is stripped from the directory name for filtering
Returns:

Tuples for adding to dynamo_consistency.datatypes.DirectoryInfo.add_file_list()

Return type:

generator
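
One possible reading of this filtering step is sketched below as a generator: the first pathstrip characters of each name are dropped, and the file is kept only if the next directory component appears in the directory list. The function filter_by_directory, its arguments, and this interpretation of pathstrip are assumptions; the real function takes a site name and reads DirectoryList from the configuration.

```python
def filter_by_directory(files, directory_list, pathstrip):
    """Yield (name, size) tuples whose top directory is in `directory_list`."""
    for name, size in files:
        # Drop the root node's name, then take the first path component.
        top = name[pathstrip:].strip('/').split('/')[0]
        if top in directory_list:
            yield name, size


inventory_files = [('/store/data/a.root', 100),
                   ('/store/user/b.root', 5),
                   ('/store/mc/c.root', 75)]

kept = list(filter_by_directory(inventory_files, ['data', 'mc'],
                                pathstrip=len('/store')))
print(kept)
```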

dynamo_consistency.inventorylister.listing(site, callback=None, **kwargs)

Get the list of files from the inventory.

Parameters:site (str) – The name of the site to load
Returns:The file replicas that are supposed to be at a site
Return type:dynamo_consistency.datatypes.DirectoryInfo