This HOW-TO is not yet complete and should be considered a draft. For background information and details see the thesis of Hugo Buddelmeijer.
For now, a SourceList has to be wrapped in a SourceListWrapper before it can be used with the SourceCollection classes. Ultimately, the SourceCollection classes will be better integrated with the SourceList and other *List classes.
from astro.main.SourceList import SourceList
from astro.main.sourcecollection.SourceCollection \
    import SourceCollection
from astro.main.sourcecollection.SourceListWrapper \
    import SourceListWrapper

# Fetch a SourceList
sl = (SourceList.SLID == 1575051)[0]

# Create a SourceCollection from the SourceList
slw = SourceListWrapper()
slw.set_sourcelist(sl)

# The SourceCollection can be made persistent if wanted.
slw.commit()
print slw.SCID
# 100511
# Continue with 'slw' from above.

# Define a criterion to select the required sources.
query = ' "DEC" < 11 '

# Define the requested attributes.
# In this case a comoving distance and the inverse concentration index.
attributes = ['R', 'iC']

# Pull a SourceCollection that has the requested data.
#
# The information system will search for existing SourceCollections
# that can be used to fulfill the request. New SourceCollections
# are created if no suitable ones are found.
scnew = slw.derive(attributes=attributes, query=query)

# New SourceCollections are created to be as general as possible
# in order to facilitate reuse. In this case, the SourceCollection
# calculating 'R' will be created to calculate the attribute for all the
# sources of slw, not only for the sources with a declination below 11.

# Check whether the SourceCollection has been processed.
scnew.is_made()

# If False is returned, the SourceCollection has to be processed.
scnew.make()

# The is_made() and make() functions only apply to the parts of the
# dependency tree that are required to build the target SourceCollection.
# In this case, 'R' will only be calculated for the sources with DEC < 11.
# Load the catalog data into a TableConverter.
scnew.load_data()

# Interface with the catalog data.
print scnew.data.attribute_order
# ['SLID', 'SID', 'R', 'iC']
print scnew.data.attributes['R']
# {'length': 1, 'ucd': '', 'null': '', 'name': 'R', 'format': 'float64'}
print scnew.data.data['R']
# [ 562.54038472 1905.82573128 397.30116968 ..., 12.93537333

# Or send the catalog data over Samp.
from astro.services.samp.Samp import Samp
s = Samp()
s.broadcast(scnew)

# Highlight or select and broadcast some sources in Topcat or Aladin.
# Retrieve the SLIDs/SIDs or the row in the TableConverter.
s.highlightedSource(scnew)
# (1575051L, 812L)
s.selectedSources(scnew)
# [(1575051L, 843L), (1575051L, 847L), (1575051L, 848L)]
s.highlighted(scnew)
# 812
# Continue with 'slw' from before.

# Pull attributes for a subset of the sources.
scnew1 = slw.derive(query=' "DEC" < 11 ', attributes=['R'])
scnew1.is_made()
# Should return True because this part of the SourceCollection has already
# been processed in the previous example. If not, make it:
scnew1.make()

# Pull attributes for a smaller subset of the sources.
scnew2 = slw.derive(query=' "DEC" < 10 ', attributes=['R'])
scnew2.is_made()
# True, because this has already been processed.

# Pull data for a larger subset of the sources.
scnew3 = slw.derive(query=' "DEC" < 12 ', attributes=['R'])
scnew3.is_made()
# False, because this part of the SourceCollection
# has not been processed completely yet.
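The three outcomes above can be pictured with a toy model. The sketch below is plain Python and not Astro-WISE code: 'ToyCalculator' and its methods are hypothetical names, and it assumes that "made" simply means that every source selected by the query already has a calculated value.

```python
# Toy model (not the Astro-WISE API): an attribute counts as "made" for a
# query when every source selected by that query already has a value.

class ToyCalculator(object):
    """Hypothetical stand-in for a SourceCollection that calculates 'R'."""

    def __init__(self, declinations):
        # Maps source identifier -> declination in degrees.
        self.declinations = dict(declinations)
        # Source identifiers for which 'R' has already been calculated.
        self.processed = set()

    def select(self, dec_limit):
        """Return the identifiers of the sources with DEC < dec_limit."""
        return set(sid for sid, dec in self.declinations.items()
                   if dec < dec_limit)

    def is_made(self, dec_limit):
        """True when all selected sources have been processed."""
        return self.select(dec_limit) <= self.processed

    def make(self, dec_limit):
        """Process the selected sources; already processed ones are kept."""
        self.processed |= self.select(dec_limit)


calc = ToyCalculator({1: 9.5, 2: 10.5, 3: 11.5, 4: 12.5})
calc.make(11)                      # processes sources 1 and 2

made_same = calc.is_made(11)       # same subset as the one processed
made_smaller = calc.is_made(10)    # smaller subset, fully covered
made_larger = calc.is_made(12)     # source 3 has not been processed yet
```

In this model the smaller selection is covered by the earlier processing, while the larger one still contains an unprocessed source, which mirrors the True/True/False pattern in the session above.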
# Continue with the datasets from above.

# Commit the SourceCollections that need to be saved. This will recursively
# commit the dependency tree as well.
scnew2.commit()

# Store the source data that has been calculated. This will store the catalog data
# in the most optimal way. Although scnew2 only represents a subset of the
# sources, all the calculated attributes are stored: The 'R' value for the sources
# in scnew1 will be stored as well.
scnew2.store_data()

# The demo session ends here.
It is preferable to use the data pulling functions and let the information system determine what needs to be created automatically. This is less work and facilitates reuse.
Nonetheless, not all SourceCollections can be created by pulling data, so it is useful to know how to create them manually.
# astro.main classes
from astro.main.SourceList import SourceList

# Virtual base SourceCollection
from astro.main.sourcecollection.SourceCollection \
    import SourceCollection

# All derived operator SourceCollections
from astro.main.sourcecollection.SourceListWrapper \
    import SourceListWrapper
from astro.main.sourcecollection.FilterSources \
    import FilterSources
from astro.main.sourcecollection.SelectSources \
    import SelectSources
from astro.main.sourcecollection.SelectAttributes \
    import SelectAttributes
from astro.main.sourcecollection.ConcatenateAttributes \
    import ConcatenateAttributes
from astro.main.sourcecollection.AttributeCalculator \
    import AttributeCalculator, AttributeCalculatorDefinition
# Fetch a SourceList
sl = (SourceList.SLID == 1575051)[0]

# Create a SourceCollection from the SourceList
slw = SourceListWrapper()
slw.set_sourcelist(sl)
# Find all AttributeCalculatorDefinition objects that can be used to
# calculate comoving distances.
acdsR = AttributeCalculatorDefinition.get_acds_by_attribute('R')

# Pick the first one.
acdR = acdsR[0]
acdR.name
# 'Comoving Distance Calculator'

# See which attributes are required by the ACD.
acdR.input_attribute_names
# ['redshift', 'RA', 'DEC', 'HTM']

# Which are all available in 'slw':
[a in slw.get_attribute_names() for a in acdR.input_attribute_names]
# [True, True, True, True]

# The required attributes have to be selected with a SelectAttributes:
sa1 = SelectAttributes()
sa1.parent_collection = slw
sa1.selected_attributes = ['redshift', 'RA', 'DEC', 'HTM']

# And the AttributeCalculator can be initialized. The 'AC' property of an ACD
# object is a class derived from the AttributeCalculator class which uses this
# definition.
ac1 = acdR.AC()
ac1.parent_collection = sa1
Similarly for the inverse concentration:
acdsiC = AttributeCalculatorDefinition.get_acds_by_attribute('iC')
acdiC = acdsiC[0]
all([a in slw.get_attribute_names() for a in acdiC.input_attribute_names])
sa2 = SelectAttributes()
sa2.parent_collection = slw
sa2.selected_attributes = [a for a in acdiC.input_attribute_names]
ac2 = acdiC.AC()
ac2.parent_collection = sa2
# A FilterSources represents a subset of the parent SourceCollection
# by evaluation of a selection criterion.
fs = FilterSources()
fs.parent_collection = slw
fs.set_query(' "DEC" < 11 ')

# Alternatively, a SelectSources SourceCollection can be used to select
# a subset that is explicitly listed by another SourceCollection.
# First create a SourceCollection with only source identifiers.
sa3 = SelectAttributes()
sa3.parent_collection = fs

# Use this to specify the selected sources of a SelectSources SourceCollection
ss = SelectSources()
ss.parent_collection = slw
ss.selected_sources = fs
# First select no attributes from the FilterSources
sa3 = SelectAttributes()
sa3.parent_collection = fs

# Select the comoving distance
sa4 = SelectAttributes()
sa4.parent_collection = ac1
sa4.selected_attributes = ['R']

# Select the inverse concentration
sa5 = SelectAttributes()
sa5.parent_collection = ac2
sa5.selected_attributes = ['iC']

# Combine all the attributes
ca = ConcatenateAttributes()
ca.parent_collections = [sa3, sa4, sa5]

# And we're done
scnew = ca
scnew.make()
scnew.load_data()
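To make the roles of the operators concrete, here is a plain-Python sketch of the kind of catalog the tree above represents. It is an illustration only and does not use the Astro-WISE classes: the helper functions and sample values are invented, each catalog is modelled as a dictionary keyed on a source identifier, the "calculator" is an arbitrary stand-in function, and it is an assumption of this sketch that concatenating attributes keeps only the sources common to all parents.

```python
# Plain-Python illustration (not the Astro-WISE API) of the operators above.

def filter_sources(catalog, predicate):
    """Keep the sources whose attributes satisfy the predicate."""
    return dict((sid, row) for sid, row in catalog.items() if predicate(row))


def select_attributes(catalog, names):
    """Keep only the named attributes; an empty list keeps identifiers only."""
    return dict((sid, dict((name, row[name]) for name in names))
                for sid, row in catalog.items())


def calculate_attribute(catalog, name, function):
    """Add one derived attribute to every source."""
    result = {}
    for sid, row in catalog.items():
        new_row = dict(row)
        new_row[name] = function(row)
        result[sid] = new_row
    return result


def concatenate_attributes(catalogs):
    """Merge attributes for the sources present in every catalog (assumption)."""
    common = set(catalogs[0])
    for catalog in catalogs[1:]:
        common &= set(catalog)
    combined = {}
    for sid in common:
        row = {}
        for catalog in catalogs:
            row.update(catalog[sid])
        combined[sid] = row
    return combined


# Invented sample data: two sources with a declination and a redshift.
base = {
    812: {'DEC': 10.0, 'redshift': 0.5},
    843: {'DEC': 11.5, 'redshift': 0.25},
}

fs = filter_sources(base, lambda row: row['DEC'] < 11)
sa3 = select_attributes(fs, [])          # identifiers only
# Arbitrary stand-in for a calculator; not a cosmological calculation.
ac1 = calculate_attribute(base, 'R', lambda row: 2.0 * row['redshift'])
sa4 = select_attributes(ac1, ['R'])
result = concatenate_attributes([sa3, sa4])
```

The attribute is calculated on the full catalog and the source selection is applied separately, which is the same separation of concerns that makes the calculated values reusable for other selections.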
from astro.main.AssociateList import AssociateList
from astro.main.sourcecollection.SourceCollection \
    import SourceCollection
from astro.main.sourcecollection.RelabelSources \
    import RelabelSources
from astro.main.sourcecollection.ConcatenateSources \
    import ConcatenateSources
from astro.main.sourcecollection.RenameAttributes \
    import RenameAttributes

# Fetch a SourceCollection and AssociateList
sc = (SourceCollection.SCID == 100161)[0]
al = (AssociateList.ALID == 472781)[0]

# Create a RelabelSources
rs = RelabelSources()
rs.parent_collection = sc
rs.associatelist = al

# 'rs' and 'sc' now represent 'different' sources, which
# can be concatenated. (This is not really useful though.)
cs = ConcatenateSources()
cs.parent_collections = [sc, rs]

# And the attributes can be renamed.
ra = RenameAttributes()
ra.parent_collection = cs
ra.attributes_old = ['MAG_ISO', 'MAGERR_ISO']
ra.attributes_new = ['MAG_B', 'MAGERR_B']
from astro.main.sourcecollection.SourceCollection \
    import SourceCollection
from astro.main.sourcecollection.SourceCollectionTree \
    import SourceCollectionTree

# Same start as above.
slw = (SourceCollection.SCID == 100511)[0]
query = ' "DEC" < 11 '
attributes = ['R', 'iC']

# Create a SourceCollectionTree from the SourceCollection. The SCT traverses
# the dependency tree and keeps the progenitors of the given SC in memory.
# A Pass SC is created with the given SC as parent, which is used as the end
# node of the tree. Later functions of the SCT will replace SCs in the tree,
# but never this end node.
sct = SourceCollectionTree(slw)

# Apply the selection criterion. There are two things that can happen:
# 1) The SCT discovers an existing SC that represents the requested sources
#    and will create a SelectSources SC to select the requested sources.
# 2) The SCT cannot find suitable SCs and creates a new FilterSources SC with
#    the end node as parent and the given selection criterion as query. The
#    query is not evaluated; the exact composition of sources is unknown.
# In both cases, the created SC is placed between the Pass node at the end of
# the tree and its parent.
sct.apply_filter(query=query)

# Select the attributes. For every attribute the SCT will search for existing
# SCs that represent the attribute for the requested set of sources. A
# hierarchy of SelectAttributes and ConcatenateAttributes SCs is created that
# provides the right attributes. The SCT will try to instantiate new
# AttributeCalculator SCs if there are no existing SCs that contain a
# requested attribute.
sct.apply_attribute_selection(attributes)

# The end node of the tree now represents the requested catalog. This is
# a Pass SC, so its parent is returned by the .derive() function.
scnew = sct.sourcecollection.parent_collection
sct = SourceCollectionTree(scnew)
sct.make_dot_graph('howtotree1')
# Continuing with 'scnew' from earlier

# Calling is_made without optimization will return True because scnew is a
# ConcatenateAttributes SC, which does not have to be processed.
scnew.is_made(optimize=False)

# Calling is_made with optimization is equal to the following:
# Create a SourceCollectionTree
sct = SourceCollectionTree(scnew)
# Optimize the tree for loading catalog data, by placing the selection of
# sources and attributes early in the tree.
sct.optimize_for_load()
# Now the entire optimized tree can be checked.
sct.is_made()
# This will recursively call 'is_made(optimize=False)' on the SCs in the
# optimized tree.

# Similarly, calling 'scnew.make()' is identical to:
sct = SourceCollectionTree(scnew)
sct.optimize_for_load()
sct.make()
# This will recursively call 'make(optimize=False)' on the SCs in the
# optimized tree that are not yet made.

# Finally, 'scnew.load_data()' is identical to:
sct = SourceCollectionTree(scnew)
sct.optimize_for_load()
sct.load_data()
# This will load the catalog data of the end node of the tree in the
# most optimal way. It will not necessarily load the catalog data
# for all the other SCs in the tree.
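The recursion described in the comments above can be sketched with a toy dependency tree. This is an illustration under stated assumptions, not the SourceCollectionTree implementation: 'ToyNode' and its methods are hypothetical, the optimization step is left out entirely, and nodes flagged as needing no processing stand in for SCs such as a ConcatenateAttributes.

```python
# Toy dependency tree (not the Astro-WISE API) showing the difference between
# checking a single node and checking the whole tree.

class ToyNode(object):
    """Hypothetical node with parents and a 'processed' flag."""

    def __init__(self, name, parents=(), needs_processing=True):
        self.name = name
        self.parents = list(parents)
        self.needs_processing = needs_processing
        self.processed = False

    def is_made_single(self):
        """Check this node only, comparable to is_made(optimize=False)."""
        return self.processed or not self.needs_processing

    def is_made_tree(self):
        """Check this node and, recursively, all of its progenitors."""
        return (self.is_made_single()
                and all(parent.is_made_tree() for parent in self.parents))

    def make_tree(self, log):
        """Process progenitors first, then this node if it is not yet made."""
        for parent in self.parents:
            parent.make_tree(log)
        if not self.is_made_single():
            self.processed = True
            log.append(self.name)


source = ToyNode('source', needs_processing=False)
calc_r = ToyNode('calc_R', parents=[source])
calc_ic = ToyNode('calc_iC', parents=[source])
end = ToyNode('concatenate', parents=[calc_r, calc_ic],
              needs_processing=False)

single_before = end.is_made_single()   # the end node itself needs no work
tree_before = end.is_made_tree()       # but its calculators are not made
log = []
end.make_tree(log)
tree_after = end.is_made_tree()
```

The end node on its own reports that it is made, while the tree-wide check only succeeds after the calculators have been processed, which is why the tree-based check is the more informative one.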
# Store the catalog data in an optimized way.
scnew.store_data()

# This is equivalent to
sct = SourceCollectionTree(scnew)
sct.store_data()
# Start with a SourceCollection.
slw = (SourceCollection.SCID == 100511)[0]

# Create an AttributeCalculator to calculate 'R' without lambda.
acd = AttributeCalculatorDefinition.get_acds_by_attribute('R')[0]
ac2 = acd.AC()
ac2.parent_collection = slw
ac2.set_process_parameter('omega_m', 1.0)
ac2.set_process_parameter('omega_l', 0.0)

# Create an SCT and manually ensure that the relevant SCs are tracked.
sct = SourceCollectionTree(slw)
sct.track_children_auto(cache=True)
sct.track_tree(ac2)

# Search for the comoving distance. An AC with the wrong omega_m is selected.
sc1 = sct.find_attribute('R')
print "#sc1 omega_m", sc1.get_process_parameter('omega_m')
#sc1 omega_m 0.3

# Define a new key function to find the correct AC.
from astro.main.sourcecollection.SourceCollectionTree \
    import key_find_attribute
def mykey(scd):
    # First retrieve the default ranking.
    tkey = key_find_attribute(scd)
    # The 'scd' is a dictionary, the key 'sc' points to the actual SC.
    sc = scd['sc']
    # Reduce the key value of SCs that are not an AttributeCalculator
    if not isinstance(sc, AttributeCalculator):
        tkey -= 10**9
    # and reduce the key value of the ACs that use another omega_m.
    elif sc.get_process_parameter('omega_m') != 1.0:
        tkey -= 10**9
    return tkey

# Set the new key function.
SourceCollectionTree.key_functions['find_attribute'] = mykey

# Search for the comoving distance again, the preferred AC is found.
sc2 = sct.find_attribute('R')
print "#sc2 omega_m", sc2.get_process_parameter('omega_m')
#sc2 omega_m 1.0
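The penalty used in 'mykey' can be looked at in isolation. The sketch below assumes, as the comments above suggest, that the search prefers the candidate with the largest key value. Every name in it is made up for the illustration (the candidate dictionaries, 'default_key', 'penalised_key' and the 'rank' values); it does not call any Astro-WISE code.

```python
# Stand-alone illustration of ranking candidates with a key function.
# All names and values are hypothetical.

candidates = [
    {'name': 'ac_default', 'is_calculator': True, 'omega_m': 0.3, 'rank': 5},
    {'name': 'ac_no_lambda', 'is_calculator': True, 'omega_m': 1.0, 'rank': 3},
    {'name': 'stored_copy', 'is_calculator': False, 'omega_m': None, 'rank': 8},
]


def default_key(candidate):
    """Stand-in for the default ranking: a higher value is preferred."""
    return candidate['rank']


def penalised_key(candidate):
    """Start from the default ranking and push unwanted candidates down."""
    key = default_key(candidate)
    if not candidate['is_calculator']:
        # Not a calculator at all: strongly reduce the key value.
        key -= 10**9
    elif candidate['omega_m'] != 1.0:
        # A calculator with another omega_m: reduce the key value as well.
        key -= 10**9
    return key


best_default = max(candidates, key=default_key)
best_penalised = max(candidates, key=penalised_key)
```

Because the penalty is much larger than any default rank, the ordering among the remaining acceptable candidates is still decided by the default ranking.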
The code of an AttributeCalculatorDefinition is stored in a file on the dataserver. This (Python) file contains a new AttributeCalculator class that is derived from the one in the code base. The create_from_file() method of the AttributeCalculatorDefinition class can be used to create a new definition from this class. Auxiliary files can be used by wrapping everything in a tarball.
The procedure for this is too long to list in this document; see demo 17 for an example.
cd $AWEPIPE/astro/experimental/SourceCollection/demos/demo17
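As a small illustration of the tarball step only, the sketch below packs a calculator file and one auxiliary file with Python's standard tarfile module. The file names, the helper 'bundle_calculator' and the choice of gzip compression are assumptions made for this example; the layout that create_from_file() actually expects is not described here, so demo 17 remains the reference.

```python
import os
import tarfile
import tempfile


def bundle_calculator(directory, filenames, tarball_path):
    """Pack the listed files from 'directory' into a gzipped tarball."""
    with tarfile.open(tarball_path, 'w:gz') as tar:
        for filename in filenames:
            # Store each file under its bare name, without the directory.
            tar.add(os.path.join(directory, filename), arcname=filename)
    return tarball_path


# Create two placeholder files in a temporary directory (hypothetical names).
workdir = tempfile.mkdtemp()
placeholder_files = [
    ('MyCalculator.py', '# the derived AttributeCalculator class goes here\n'),
    ('lookup_table.txt', '0.1 1.0\n'),
]
for filename, content in placeholder_files:
    with open(os.path.join(workdir, filename), 'w') as handle:
        handle.write(content)

tarball = bundle_calculator(workdir,
                            ['MyCalculator.py', 'lookup_table.txt'],
                            os.path.join(workdir, 'MyCalculator.tar.gz'))

# List the contents of the tarball to verify what was packed.
with tarfile.open(tarball) as tar:
    members = sorted(tar.getnames())
```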
The SAMP interaction is described in the SAMP how-to: http://www.astro-wise.org/portal/howtos/man_howto_samp/man_howto_samp.shtml
The Query Driven Visualization is described in the QDV how-to: http://www.astro-wise.org/portal/howtos/man_howto_qdv/man_howto_qdv.shtml