fastavro.read¶
- class reader(fo, reader_schema=None, return_record_name=False)¶
Iterator over records in an avro file.
- Parameters
fo (file-like) – Input stream
reader_schema (dict, optional) – Reader schema
return_record_name (bool, optional) – If true, when reading a union of records, the result will be a tuple where the first value is the name of the record and the second value is the record itself
Example:
    from fastavro import reader

    with open('some-file.avro', 'rb') as fo:
        avro_reader = reader(fo)
        for record in avro_reader:
            process_record(record)
Because fo can be any file-like object, another common pattern is to read from an io.BytesIO buffer:

    from io import BytesIO
    from fastavro import writer, reader

    # schema and records are assumed to be defined elsewhere
    fo = BytesIO()
    writer(fo, schema, records)
    fo.seek(0)
    for record in reader(fo):
        process_record(record)
- metadata¶
Key-value pairs in the header metadata
- codec¶
The codec used when writing
- writer_schema¶
The schema used when writing
- reader_schema¶
The schema used when reading (if provided)
- class block_reader(fo, reader_schema=None, return_record_name=False)¶
Iterator over Block objects in an avro file.
- Parameters
fo (file-like) – Input stream
reader_schema (dict, optional) – Reader schema
return_record_name (bool, optional) – If true, when reading a union of records, the result will be a tuple where the first value is the name of the record and the second value is the record itself
Example:
    from fastavro import block_reader

    with open('some-file.avro', 'rb') as fo:
        avro_reader = block_reader(fo)
        for block in avro_reader:
            process_block(block)
- metadata¶
Key-value pairs in the header metadata
- codec¶
The codec used when writing
- writer_schema¶
The schema used when writing
- reader_schema¶
The schema used when reading (if provided)
- class Block(bytes_, num_records, codec, reader_schema, writer_schema, named_schemas, offset, size, return_record_name=False)¶
An avro block. Will yield records when iterated over
- num_records¶
Number of records in the block
- writer_schema¶
The schema used when writing
- reader_schema¶
The schema used when reading (if provided)
- offset¶
Offset of the block from the beginning of the avro file
- size¶
Size of the block in bytes
- schemaless_reader(fo, writer_schema, reader_schema=None, return_record_name=False)¶
Reads a single record written using schemaless_writer().
- Parameters
fo (file-like) – Input stream
writer_schema (dict) – Schema used when calling schemaless_writer
reader_schema (dict, optional) – If the schema has changed since being written then the new schema can be given to allow for schema migration
return_record_name (bool, optional) – If true, when reading a union of records, the result will be a tuple where the first value is the name of the record and the second value is the record itself
Example:
    import fastavro

    # schema is the schema that was passed to schemaless_writer
    parsed_schema = fastavro.parse_schema(schema)
    with open('file', 'rb') as fp:
        record = fastavro.schemaless_reader(fp, parsed_schema)
Note: The schemaless_reader can only read a single record.
- is_avro(path_or_buffer)¶
Return True if path (or buffer) points to an Avro file. This will only work for avro files that contain the normal avro schema header, like those created by writer(). This function is not intended to be used with binary data created from schemaless_writer(), since that does not include the avro header.
- Parameters
path_or_buffer (path to file or file-like object) – Path to the file, or an open file-like object