Base
Base abstractions for Parsers and Converters.
A Parser parses a given document type into a ParsedDocumentType.
A Converter converts a given document from one format to another so that it is compatible with an existing Parser.
Example: given a Parser that parses .pdf documents, we can have Converters that convert .docx and .pptx files into .pdf.
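The relationship between the two abstractions can be sketched roughly as follows. This is a minimal illustration, not the supermat source: the Parser and Converter classes below are stand-ins that only mirror the documented method signatures, ParsedDocumentType is replaced by a placeholder alias, and parse_with_conversion, FakePdfConverter, and FakePdfParser are hypothetical names invented for the example.

```python
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Any, Dict

# Placeholder alias: the real ParsedDocumentType is defined in supermat and not shown here.
ParsedDocumentType = Any


class Parser(ABC):
    """Stand-in mirroring the documented Parser interface."""

    @abstractmethod
    def parse(self, file_path: Path) -> ParsedDocumentType: ...


class Converter(ABC):
    """Stand-in mirroring the documented Converter interface."""

    @abstractmethod
    def convert(self, file_path: Path) -> Path: ...


def parse_with_conversion(
    file_path: Path, parser: Parser, converters: Dict[str, Converter]
) -> ParsedDocumentType:
    """Hypothetical helper: run a Converter first when one is registered for the suffix."""
    converter = converters.get(file_path.suffix.lower())
    if converter is not None:
        file_path = converter.convert(file_path)
    return parser.parse(file_path)


class FakePdfConverter(Converter):
    """Toy converter: only swaps the suffix, performs no real conversion."""

    def convert(self, file_path: Path) -> Path:
        return file_path.with_suffix(".pdf")


class FakePdfParser(Parser):
    """Toy parser: reports which file it was asked to parse."""

    def parse(self, file_path: Path) -> ParsedDocumentType:
        return {"source": file_path.name}


converters = {".docx": FakePdfConverter(), ".pptx": FakePdfConverter()}
result = parse_with_conversion(Path("slides.pptx"), FakePdfParser(), converters)
print(result)  # {'source': 'slides.pdf'}
```

Keying the converters by file suffix is one possible design; the point is only that a single .pdf Parser can serve several input formats once each has a Converter.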
Converter
Bases: ABC
Source code in supermat/core/parser/base.py
convert(file_path)
abstractmethod
Converts input file to another file type and saves it. The saved file path is returned.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
file_path | Path | Input file. | required |
Returns:
Name | Type | Description |
---|---|---|
Path | Path | Output file after conversion. |
Source code in supermat/core/parser/base.py
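As an illustration of the convert contract (save the converted file, return its path), here is a minimal sketch of a concrete subclass. The Converter class is a stand-in that mirrors the documented signature rather than an import from supermat, and TxtToMarkdownConverter is a hypothetical toy that copies text instead of doing a real format conversion such as docx to pdf.

```python
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class Converter(ABC):
    """Stand-in mirroring the documented Converter interface."""

    @abstractmethod
    def convert(self, file_path: Path) -> Path:
        """Convert the input file, save the result, and return the saved path."""


class TxtToMarkdownConverter(Converter):
    """Hypothetical toy converter: writes a .md copy next to the .txt input."""

    def convert(self, file_path: Path) -> Path:
        output_path = file_path.with_suffix(".md")
        text = file_path.read_text(encoding="utf-8")
        output_path.write_text(text, encoding="utf-8")
        return output_path


# Usage: convert a temporary file and inspect the returned path.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "notes.txt"
    source.write_text("hello", encoding="utf-8")
    converted = TxtToMarkdownConverter().convert(source)
    converted_name = converted.name
    converted_text = converted.read_text(encoding="utf-8")

# The abstract base cannot be instantiated until convert is implemented.
try:
    Converter()
    base_is_abstract = False
except TypeError:
    base_is_abstract = True
```

Writing the output next to the input is an assumption made for the example; the documentation only states that the saved file path is returned.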
Parser
Bases: ABC
Source code in supermat/core/parser/base.py
parse(file_path)
abstractmethod
Parses the given file into a ParsedDocumentType.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
file_path | Path | Input file. | required |
Returns:
Name | Type | Description |
---|---|---|
ParsedDocumentType | ParsedDocumentType | Parsed document. |
Source code in supermat/core/parser/base.py
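A concrete parser only needs to implement parse. The sketch below is illustrative and makes several assumptions: the Parser class is a stand-in mirroring the documented signature, ParsedDocumentType is replaced by a placeholder list-of-dicts alias because its real structure is not shown here, and ParagraphParser with its "type" and "text" keys is a hypothetical toy.

```python
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Dict, List

# Placeholder alias: the real ParsedDocumentType is defined in supermat and may differ.
ParsedDocumentType = List[Dict[str, str]]


class Parser(ABC):
    """Stand-in mirroring the documented Parser interface."""

    @abstractmethod
    def parse(self, file_path: Path) -> ParsedDocumentType:
        """Parse the given file and return the parsed document."""


class ParagraphParser(Parser):
    """Hypothetical toy parser: one record per blank-line-separated paragraph."""

    def parse(self, file_path: Path) -> ParsedDocumentType:
        text = file_path.read_text(encoding="utf-8")
        paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
        return [{"type": "Text", "text": p} for p in paragraphs]


# Usage: parse a small temporary text file.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "doc.txt"
    source.write_text("First.\n\nSecond.\n", encoding="utf-8")
    parsed = ParagraphParser().parse(source)

print(parsed)
```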