All code snippets on this page will reference the cities dataset we created above, so place that code at the beginning of every code snippet.
Accessing Data in a Dataset
Each dataset provides a few functions for accessing different parts of its data. These are listed in the table below.
|getColumnAsList(colIndex)||Returns the column at the specified index as a list.|
|getColumnCount()||Returns the number of columns in the dataset.|
|getColumnIndex(colName)||Returns the index of the column with the name colName.|
|getColumnName(colIndex)||Returns the name of the column at the index colIndex.|
|getColumnNames()||Returns a list with the names of all the columns.|
|getColumnType(colIndex)||Returns the type of the column at the index colIndex.|
|getColumnTypes()||Returns a list with the types of all the columns.|
|getRowCount()||Returns the number of rows in the dataset.|
|getValueAt(rowIndex, colIndex)||Returns the value at the specified row and column indexes.|
|getValueAt(rowIndex, colName)||Returns the value at the specified row index and column name.|
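To experiment with these accessors outside of Ignition, the sketch below uses a small stand-in class. `ListDataset` and the sample rows are hypothetical and exist only for illustration. The class mimics a handful of the functions described above and is not Ignition's implementation. The population figures are rounded placeholders.

```python
# Hypothetical stand-in that mimics a few of the dataset accessors described above.
# It is NOT Ignition's dataset class; it only lets the calls run in plain Python.
class ListDataset(object):
    def __init__(self, headers, rows):
        self._headers = list(headers)
        # Copy rows into tuples so the stand-in is read-only, like a real dataset.
        self._rows = [tuple(r) for r in rows]

    def getRowCount(self):
        return len(self._rows)

    def getColumnCount(self):
        return len(self._headers)

    def getColumnNames(self):
        return list(self._headers)

    def getColumnIndex(self, colName):
        return self._headers.index(colName)

    def getColumnAsList(self, colIndex):
        return [row[colIndex] for row in self._rows]

    def getValueAt(self, rowIndex, col):
        # Accept either a column index or a column name.
        if not isinstance(col, int):
            col = self.getColumnIndex(col)
        return self._rows[rowIndex][col]


# Illustrative sample data (assumed; placeholder population values).
headers = ["City", "Population", "Timezone"]
rows = [["New York", 8000000, "EST"],
        ["Los Angeles", 4000000, "PST"],
        ["Chicago", 2700000, "CST"]]
cities = ListDataset(headers, rows)

print(cities.getRowCount())               # 3
print(cities.getColumnIndex("Timezone"))  # 2
print(cities.getValueAt(1, "Timezone"))   # PST
print(cities.getColumnAsList(0))
```

Inside Ignition the stand-in is unnecessary, since the same calls are made directly on the dataset object.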
Looping Through a Dataset
Oftentimes you need to loop through the items in a dataset similar to how you would loop through a list of items. You can use the functions above to do this.
# We use the same cities dataset from above. Using the range function, we can generate the row and column indexes to loop over.
for row in range(cities.getRowCount()):
    for col in range(cities.getColumnCount()):
        # Prints every item in the cities dataset, starting on the first row and moving left to right.
        print cities.getValueAt(row, col)
Accessing Data in a PyDataset
PyDatasets can be accessed in the same ways that Datasets can. This means that all of the above functions (getColumnCount(), getValueAt(), etc.) can be used with PyDatasets too.
PyDatasets are special in that they can also be handled like other Python sequences. Any dataset object can be converted to a PyDataset using the function system.dataset.toPyDataSet. Once converted, the data can be accessed much more easily, in the same way you would access the items in a list.
# First convert the cities dataset to a PyDataset.
pyData = system.dataset.toPyDataSet(cities)

# The data can then be accessed using two sets of brackets holding the row and column indexes.
# This will print "PST", assuming the time zone is stored in the third column (index 2).
print pyData[1][2]
Looping Through a PyDataset
Looping through a PyDataset is also a bit easier, because it works much like looping through other sequences. The first for loop pulls out each row, which acts like a list and can be used in a second for loop to extract the values.
# Convert to a PyDataset
pyData = system.dataset.toPyDataSet(cities)

# The for loop pulls out the whole row, so typically the variable row is used.
for row in pyData:
    # Now that we have a single row, we can loop through the columns just like a list.
    for value in row:
        print value
Additionally, a single column of data can be extracted by looping through the PyDataset.
# Convert to a PyDataset
pyData = system.dataset.toPyDataSet(cities)

# Use a for loop to extract a single row at a time
for row in pyData:
    # Use either the column index or the column name to extract a single value from that row.
    city = row[0]
    population = row["Population"]
    print city, population
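Because a PyDataset behaves like a sequence of rows, the same extraction can also be written as a list comprehension. The sketch below uses a plain list of lists as a stand-in for a PyDataset so that it runs outside Ignition. The sample rows are assumed for illustration, and name-based access such as row["Population"] is not available on plain lists.

```python
# Plain list of lists standing in for a PyDataset (assumed sample data).
py_data = [
    ["New York", 8000000, "EST"],
    ["Los Angeles", 4000000, "PST"],
    ["Chicago", 2700000, "CST"],
]

# Pull the first column (the city name) out of every row.
city_names = [row[0] for row in py_data]
print(city_names)

# Pair two columns per row, mirroring the loop above.
city_populations = [(row[0], row[1]) for row in py_data]
for city, population in city_populations:
    print("%s %d" % (city, population))
```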
A PyRow is a row in a PyDataset. It works similarly to a Python list.
The examples and outputs are based on the results in the table below. In addition, "print" commands are used, but should be replaced by appropriate logging methods (such as system.util.getLogger) depending on the scope of the script.
|index()||Returns the index of the first occurrence of the element. Raises a ValueError if the element isn't present in the row.|
|count()||Returns the total number of occurrences of the given element in the row.|
You can also have repeating elements in a row:
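As a minimal sketch, a plain Python list can stand in for a PyRow to show how index() and count() behave when a value repeats. The row values are made up for illustration; a real PyRow would come from a PyDataset in Ignition.

```python
# A plain list standing in for a PyRow; the values are illustrative only.
row = ["Chicago", "CST", "Chicago", 2700000]

# index() reports the position of the first match only.
first_position = row.index("Chicago")
print(first_position)  # 0

# count() reports how many times the value appears in the row.
occurrences = row.count("Chicago")
print(occurrences)  # 2

# index() raises ValueError when the value is not in the row.
missing = False
try:
    row.index("Denver")
except ValueError:
    missing = True
    print("Denver is not in the row")
```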
Altering a Dataset
Technically, you cannot alter a dataset. Datasets are immutable, meaning they cannot change. You can, however, create new datasets. To change a dataset, you create a new one and then replace the old one with it. System functions are available that alter or manipulate datasets in this way. Any of the functions in the system.dataset section can be used on datasets; the most common ones have been listed below:
The important thing to realize about all of these functions is that, again, they do not actually alter the input dataset. They return a new dataset. You need to use that returned dataset to do anything useful.
For example, the following code uses the setValue function to create a dataset with a different population value for Los Angeles.
# Create a new dataset with the new value.
newData = system.dataset.setValue(cities, 1, "Population", 5000000)

# The cities dataset remains unchanged, and we can see this by looping through both datasets.
for row in range(cities.getRowCount()):
    for col in range(cities.getColumnCount()):
        print cities.getValueAt(row, col)

for row in range(newData.getRowCount()):
    for col in range(newData.getColumnCount()):
        print newData.getValueAt(row, col)
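The copy-and-replace pattern can also be illustrated without Ignition. In the sketch below, `set_value` is a hypothetical helper (not system.dataset.setValue) that works on a tuple of tuples, which is immutable in Python. The sample rows are assumed for illustration.

```python
def set_value(rows, row_index, col_index, value):
    """Return a new tuple of rows with one cell replaced; the input is untouched."""
    new_rows = []
    for i, row in enumerate(rows):
        if i == row_index:
            # Build a fresh row tuple around the replaced cell.
            row = row[:col_index] + (value,) + row[col_index + 1:]
        new_rows.append(row)
    return tuple(new_rows)


# Assumed sample data; placeholder population values.
cities_rows = (
    ("New York", 8000000),
    ("Los Angeles", 4000000),
)

new_rows = set_value(cities_rows, 1, 1, 5000000)

print(cities_rows[1][1])  # 4000000, the original is unchanged
print(new_rows[1][1])     # 5000000, only the copy has the new value
```

As with the system.dataset functions, the result is only useful if the returned value is kept, for example by assigning it back to the variable or property that held the old data.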