aisdb.database.decoder module

Parses NMEA messages to create an SQL database. See the function decode_msgs() for usage.

class aisdb.database.decoder.FileChecksums(*, dbconn)[source]

Bases: object

checksum_exists(checksum)[source]
checksums_table()[source]

Instantiates a new database connection and creates a checksums hashmap table if it doesn’t exist yet.

Creates a temporary directory and saves its path to self.tmp_dir.

Creates the SQLite connection attribute self.dbconn, which should be closed after use,

e.g.

self.dbconn.close()

get_md5(path, f)[source]

Get the MD5 hash of the first kilobyte of data.
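As an illustration of hashing only the first kilobyte of a file, here is a minimal sketch using the standard library. The function name md5_first_kilobyte and the 1024-byte read size are assumptions made for this example; the exact byte count and signature used by get_md5 are not stated here.

```python
import hashlib


def md5_first_kilobyte(path, nbytes=1024):
    """Return the hex MD5 digest of the first `nbytes` bytes of a file.

    Hypothetical helper for illustration; not the aisdb implementation.
    The 1024-byte default is an assumption.
    """
    with open(path, "rb") as f:
        head = f.read(nbytes)  # returns fewer bytes if the file is shorter
    return hashlib.md5(head).hexdigest()
```

Because only the head of the file is read, the cost is independent of file size. In this sketch, two files that share their first kilobyte produce the same digest.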

insert_checksum(checksum)[source]
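
The checksum methods above can be pictured with a small sqlite3 sketch. The class name ChecksumStore, the table name checksums, and the column name md5 are all hypothetical; this is one plausible layout under those assumptions, not the schema FileChecksums actually uses.

```python
import sqlite3


class ChecksumStore:
    """Hypothetical stand-in for a checksums table; not the aisdb class."""

    def __init__(self, dbpath=":memory:"):
        self.dbconn = sqlite3.connect(dbpath)
        # Create the table only if it does not exist yet.
        self.dbconn.execute(
            "CREATE TABLE IF NOT EXISTS checksums (md5 TEXT PRIMARY KEY)"
        )

    def checksum_exists(self, checksum):
        row = self.dbconn.execute(
            "SELECT 1 FROM checksums WHERE md5 = ?", (checksum,)
        ).fetchone()
        return row is not None

    def insert_checksum(self, checksum):
        # The connection context manager commits on success.
        with self.dbconn:
            self.dbconn.execute(
                "INSERT OR IGNORE INTO checksums (md5) VALUES (?)", (checksum,)
            )

    def close(self):
        self.dbconn.close()
```

A caller would check checksum_exists() before decoding a file and call insert_checksum() afterwards, closing the connection when finished.
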
aisdb.database.decoder.decode_msgs(filepaths, dbconn, source, vacuum=False, skip_checksum=False, verbose=True)[source]

Decode NMEA format AIS messages and store in an SQLite database. To speed up decoding, create the database on a different hard drive from where the raw data is stored. A checksum of the first kilobyte of every file will be stored to prevent loading the same file twice.

If the filepath has a .gz or .zip extension, the file will be decompressed into a temporary directory before it is inserted into the database.
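
The decompression step described above might look roughly like the following sketch, which uses only the standard library. The helper name decompress_to_dir is hypothetical, and the handling shown here (one output file per .gz, all members per .zip) is an assumption rather than the library's documented behaviour.

```python
import gzip
import os
import shutil
import zipfile


def decompress_to_dir(filepath, tmp_dir):
    """Return the paths to read for `filepath`.

    Hypothetical helper: .gz and .zip inputs are extracted into `tmp_dir`;
    any other file is returned unchanged.
    """
    name = os.path.basename(filepath)
    lower = name.lower()
    if lower.endswith(".gz"):
        out = os.path.join(tmp_dir, name[:-3])  # strip the ".gz" suffix
        with gzip.open(filepath, "rb") as src, open(out, "wb") as dst:
            shutil.copyfileobj(src, dst)
        return [out]
    if lower.endswith(".zip"):
        with zipfile.ZipFile(filepath) as zf:
            zf.extractall(tmp_dir)
            # Skip directory entries, which end with "/".
            return [os.path.join(tmp_dir, n)
                    for n in zf.namelist() if not n.endswith("/")]
    return [filepath]
```

Pairing this with tempfile.TemporaryDirectory() keeps the extracted copies from outliving the decoding run.
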

Parameters:
  • filepaths (list) – absolute filepath locations for AIS message files to be ingested into the database

  • dbconn (aisdb.database.dbconn.DBConn) – database connection object

  • source (string) – data source name or description. It is used as a primary key column, so identical messages from different sources are stored separately rather than ignored as duplicates on insert

  • vacuum (boolean, str) – if True, the database will be vacuumed after completion. If a string, the database will be vacuumed into the filepath given. Consider vacuuming to a second hard disk to speed this up

Returns:

None
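
The two modes of the vacuum parameter correspond to SQLite's VACUUM and VACUUM INTO statements. The sketch below shows that distinction with the standard sqlite3 module. The function vacuum_database is hypothetical and is not how decode_msgs is implemented. VACUUM INTO requires SQLite 3.27 or newer, and the target file must not already contain data.

```python
import sqlite3


def vacuum_database(dbpath, vacuum=False):
    """Hypothetical helper mirroring the bool-or-path `vacuum` option."""
    if vacuum is False:
        return
    # isolation_level=None puts the connection in autocommit mode;
    # VACUUM cannot run inside an open transaction.
    conn = sqlite3.connect(dbpath, isolation_level=None)
    try:
        if vacuum is True:
            conn.execute("VACUUM")  # rebuild the database in place
        elif isinstance(vacuum, str):
            # Write a compacted copy to the given filepath.
            conn.execute("VACUUM INTO ?", (vacuum,))
        else:
            raise TypeError("vacuum must be a bool or a filepath string")
    finally:
        conn.close()
```

Pointing the string form at a path on a different disk lets the source be read and the copy be written on separate devices.
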

example:

>>> from aisdb import decode_msgs
>>> from aisdb.database.dbconn import SQLiteDBConn
>>> filepaths = ['aisdb/tests/testdata/test_data_20210701.csv',
...              'aisdb/tests/testdata/test_data_20211101.nm4']
>>> with SQLiteDBConn('test_decode_msgs.db') as dbconn:
...     decode_msgs(filepaths=filepaths, dbconn=dbconn,
...                 source='TESTING', verbose=False)
aisdb.database.decoder.decoder(dbpath, psql_conn_string, files, source, verbose)

Parse NMEA-formatted strings and create databases from raw AIS transmissions.

Parameters:
  • dbpath (str) – Output SQLite database path. Set this to an empty string to only use Postgres

  • psql_conn_string (str) – Postgres database connection string. Set this to an empty string to only use SQLite

  • files (array of str) – array of .nm4 raw data filepath strings

  • source (str) – data source text. Will be used as a primary key index in database

  • verbose (bool) – enables logging

Returns:

None

aisdb.database.decoder.fast_unzip(zipfilenames, dirname, processes=12)[source]

Unzip many files in parallel. Any existing unzipped files in the target directory will be skipped.
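
To illustrate the skip-if-present behaviour, here is a minimal sketch of parallel extraction. It uses a thread pool from concurrent.futures rather than worker processes, which is a substitution made to keep the example simple. The names parallel_unzip and _unzip_one are hypothetical, and this is not the fast_unzip implementation.

```python
import os
import zipfile
from concurrent.futures import ThreadPoolExecutor


def _unzip_one(zipfilename, dirname):
    """Extract members of one archive, skipping files that already exist."""
    extracted = []
    with zipfile.ZipFile(zipfilename) as zf:
        for member in zf.infolist():
            target = os.path.join(dirname, member.filename)
            if member.is_dir() or os.path.exists(target):
                continue  # leave existing files untouched
            zf.extract(member, dirname)
            extracted.append(target)
    return extracted


def parallel_unzip(zipfilenames, dirname, workers=4):
    """Unzip several archives concurrently; return newly extracted paths."""
    os.makedirs(dirname, exist_ok=True)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda z: _unzip_one(z, dirname), zipfilenames)
        return [path for paths in results for path in paths]
```

The sketch assumes the archives do not contain members with the same name; if they did, two workers could race to write the same target.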