soundDENA

soundDENA is a Python library for easily and precisely accessing various sorts of natural sounds data in bulk. It allows you to treat the NSNSD hierarchical file structure almost as though it were a queryable database.

soundDENA:

  • Provides accessor methods which handle and normalize inconsistencies in naming and file formats: input data may differ, but the output is consistent
  • Associates metadata with data, so data can be selected based on a query of its metadata
  • Returns data in pandas structures, which excel at concise and efficient subselection, querying, aggregation, and general wrangling
  • Plays nicely with the Python scientific computing ecosystem
  • Handles missing data without making a fuss
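To illustrate the idea of selecting data by querying its metadata, here is a minimal sketch in plain pandas. It does not use soundDENA itself: the `metadata` table, the site IDs, and the row values are made up for illustration, and the column names (`unit`, `type`, `winter_site`) are assumptions chosen to resemble a typical site metadata table.

```python
import pandas as pd

# Hypothetical metadata table: one row per site, indexed by a made-up site ID.
# None of these rows come from real soundDENA data.
metadata = pd.DataFrame(
    {
        "unit": ["DENA", "DENA", "DENA", "OTHER"],
        "type": ["Grid", "Grid", "Other", "Grid"],
        "winter_site": [False, True, False, False],
    },
    index=pd.Index(["EXAMPLE_A", "EXAMPLE_B", "EXAMPLE_C", "EXAMPLE_D"], name="siteID"),
)

# Column names can be used directly inside the query string.
query = 'unit == "DENA" and type == "Grid" and not winter_site'
summer_grid = metadata.query(query)

# The selected index can then be used to decide which sites' data to load.
selected_sites = list(summer_grid.index)
print(selected_sites)
```

Only the rows that satisfy every condition in the query string survive, so the resulting index acts as the list of sites whose data should be loaded.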

A taste of soundDENA:

In [1]: import soundDENA

In [2]: query = 'unit == "DENA" and type == "Grid" and not winter_site'

In [3]: denaSummerGrid = soundDENA.metadata.query(query)

In [4]: srcids = soundDENA.srcid.all(denaSummerGrid)

In [5]: srcids
Out[5]: 
                                      len    ...        tagDate
siteID                                       ...               
DENABEAR2007 2007-06-14 00:46:15 00:05:39    ...     1970-01-01
             2007-06-14 01:39:11 00:05:07    ...     1970-01-01
             2007-06-14 05:42:34 00:02:25    ...     1970-01-01
             2007-06-14 05:45:00 00:03:39    ...     1970-01-01
...                                   ...    ...            ...
DENAWFYR2006 2006-08-09 12:32:35 00:04:31    ...     1970-01-01
             2006-08-09 08:15:47 00:06:39    ...     1970-01-01
             2006-08-09 03:16:22 00:04:20    ...     1970-01-01
             2006-08-09 02:02:57 00:02:30    ...     1970-01-01

[18301 rows x 10 columns]

# Total length of noise events, by vehicle type:
In [6]: srcids.groupby("srcID")["len"].sum()
Out[6]: 
srcID
0.0    0 days 00:00:00
1.0    0 days 00:00:46
1.1   14 days 08:04:05
1.2   30 days 07:15:00
1.3    1 days 06:28:35
2.0    4 days 07:43:36
Name: len, dtype: timedelta64[ns]
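The aggregation in the session above relies on ordinary pandas behavior: summing a timedelta column within each group. The following sketch reproduces that pattern on a small made-up table, without soundDENA; the `events` frame and its values are hypothetical and exist only to show the mechanics.

```python
import pandas as pd

# Hypothetical noise events: a source ID and an event length for each row.
events = pd.DataFrame(
    {
        "srcID": [1.1, 1.1, 1.2, 2.0],
        "len": pd.to_timedelta(["00:05:39", "00:05:07", "00:02:25", "00:03:39"]),
    }
)

# Sum the event lengths within each source ID; the result stays a timedelta Series.
total_by_src = events.groupby("srcID")["len"].sum()

# Timedeltas convert to plain numbers when needed, for example hours per source.
hours_by_src = total_by_src.dt.total_seconds() / 3600

print(total_by_src)
```

Because the result is a regular pandas Series, it can be sorted, plotted, or joined to other tables like any other.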

Contents: