.. index:: encoder

Encoders
========

This section gives an overview of encoders, details on the encoders
that ship with libxo, and documentation for developers of future
encoders.

Overview
--------

The libxo library contains software to generate four "built-in"
formats: text, XML, JSON, and HTML.  These formats are common and
useful, but there are other common and useful formats that users will
want, and including them all in the libxo software would be difficult
and cumbersome.

To allow support for additional encodings, libxo includes a
"pluggable" extension mechanism for dynamically loading new encoders.
libxo-based applications can automatically use any installed encoder.

Use the "encoder=XXX" option to access encoders.  The following
example uses the "cbor" encoder, saving the output into a file::

    df --libxo encoder=cbor > df-output.cbor

Encoders can support specific options that can be accessed by
following the encoder name with a colon (':') or a plus sign ('+') and
one or more options, separated by the same character::

    df --libxo encoder=csv+path=filesystem+leafs=name+no-header
    df --libxo encoder=csv:path=filesystem:leafs=name:no-header

These examples instruct libxo to load the "csv" encoder and pass the
following options::

    path=filesystem
    leafs=name
    no-header

Each of these options is interpreted by the encoder, and all such
option names and semantics are specific to the particular encoder.
Refer to the intended encoder for documentation on its options.

The string "@" can be used in place of the string "encoder="::

    df --libxo @csv:no-header

.. _csv_encoder:

CSV - Comma Separated Values
----------------------------

libxo ships with a custom encoder for "CSV" files, a common format for
comma separated values.  The output of the CSV encoder can be loaded
directly into spreadsheets or similar applications.

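The CSV encoder's options are passed using the option syntax described
in the overview: the encoder name, then a ':' or '+', then options
separated by that same character.  The following C sketch shows one
way such a string could be split.  It is an illustration only, not
libxo's parser; ``demo_spec_t``, ``demo_split_spec``, and
``DEMO_MAX_OPTS`` are hypothetical names invented for this example.

```c
#include <string.h>

#define DEMO_MAX_OPTS 8         /* hypothetical limit for this sketch */

/* Hypothetical result: the encoder name plus its raw option strings. */
typedef struct {
    char buf[128];              /* private copy of the input, split in place */
    const char *name;           /* encoder name, e.g. "csv" */
    const char *opts[DEMO_MAX_OPTS];
    int nopts;
} demo_spec_t;

/*
 * Split "name<sep>opt<sep>opt...", where <sep> is the first ':' or '+'
 * found in the string; that same character separates the remaining
 * options.  Returns 0 on success, or -1 if the input is empty, too
 * long for the buffer, or holds more than DEMO_MAX_OPTS options.
 */
static int
demo_split_spec (const char *input, demo_spec_t *sp)
{
    size_t len = strlen(input);

    if (len == 0 || len >= sizeof(sp->buf))
        return -1;

    memcpy(sp->buf, input, len + 1);
    sp->name = sp->buf;
    sp->nopts = 0;

    char *cp = strpbrk(sp->buf, ":+");
    if (cp == NULL)
        return 0;               /* encoder name with no options */

    char sep = *cp;             /* remember which separator was chosen */
    while (cp != NULL) {
        *cp++ = '\0';           /* terminate the previous token */
        if (sp->nopts >= DEMO_MAX_OPTS)
            return -1;
        sp->opts[sp->nopts++] = cp;
        cp = strchr(cp, sep);
    }

    return 0;
}
```

Calling ``demo_split_spec("csv:path=item:no-header", &spec)`` in this
sketch leaves "csv" in ``spec.name`` and the two strings "path=item"
and "no-header" in ``spec.opts``; interpreting each option remains the
job of the encoder itself.
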
A standard for CSV files is provided in :RFC:`4180`, but since the
format predates that standard by decades, there are many minor
differences in CSV file consumers and their expectations.  The CSV
encoder has a number of options to tailor output to those
expectations.

Consider the following XML::

    % list-items --libxo xml,pretty
    <top>
      <data>
        <item>
          <sku>GRO-000-415</sku>
          <name>gum</name>
          <sold>1412</sold>
          <in-stock>54</in-stock>
          <on-order>10</on-order>
        </item>
        <item>
          <sku>HRD-000-212</sku>
          <name>rope</name>
          <sold>85</sold>
          <in-stock>4</in-stock>
          <on-order>2</on-order>
        </item>
        <item>
          <sku>HRD-000-517</sku>
          <name>ladder</name>
          <sold>0</sold>
          <in-stock>2</in-stock>
          <on-order>1</on-order>
        </item>
      </data>
    </top>

This output is a list of `instances` (named "item"), each containing a
set of `leafs` ("sku", "name", etc).

The CSV encoder will emit the leaf values in this output as `fields`
inside a CSV `record`, which is a line containing a set of
comma-separated values::

    % list-items --libxo encoder=csv
    sku,name,sold,in-stock,on-order
    GRO-000-415,gum,1412,54,10
    HRD-000-212,rope,85,4,2
    HRD-000-517,ladder,0,2,1

Be aware that since the CSV encoder looks for data instances, when
used with :ref:`xo`, the `--instance` option will be needed::

    % xo --libxo encoder=csv --instance foo 'The {:product} is {:status}\n' stereo "in route"
    product,status
    stereo,in route

.. _csv_path:

The `path` Option
~~~~~~~~~~~~~~~~~

By default, the CSV encoder will attempt to emit any list instance
generated by the application.  In some cases, this may be
unacceptable, and a specific list may be desired.

Use the "path" option to limit the processing of output to a specific
hierarchy.  The path should be one or more names of containers or
lists.

For example, if the "list-items" application generates other lists,
the user can give "path=top/data/item" as a path::

    % list-items --libxo encoder=csv:path=top/data/item
    sku,name,sold,in-stock,on-order
    GRO-000-415,gum,1412,54,10
    HRD-000-212,rope,85,4,2
    HRD-000-517,ladder,0,2,1

Paths are "relative", meaning they need not be a complete set
of names to the list.  This means that "path=item" may be sufficient
for the above example.

.. _csv_leafs:

The `leafs` Option
~~~~~~~~~~~~~~~~~~

The CSV encoding requires that all lines of output have the same
number of fields with the same order.  In contrast, XML and JSON allow
any order (though libxo forces key leafs to appear before other
leafs).

To maintain a consistent set of fields inside the CSV file, the same
set of leafs must be selected from each list item.  By default, the
CSV encoder records the set of leafs that appear in the first list
instance it processes, and extracts only those leafs from future
instances.  If the first instance is missing a leaf that is desired by
the consumer, the "leafs" option can be used to ensure that an empty
value is recorded for instances that lack a particular leaf.

The "leafs" option can also be used to exclude leafs, limiting the
output to only those leafs provided.

In addition, the order of the output fields follows the order in which
the leafs are listed.  "leafs=one.two" and "leafs=two.one" give
distinct output.

So the "leafs" option can be used to expand, limit, and order the set
of leafs.

The value of the leafs option should be one or more leaf names,
separated by a period (".")::

    % list-items --libxo encoder=csv:leafs=sku.on-order
    sku,on-order
    GRO-000-415,10
    HRD-000-212,2
    HRD-000-517,1

    % list-items --libxo encoder=csv:leafs=on-order.sku
    on-order,sku
    10,GRO-000-415
    2,HRD-000-212
    1,HRD-000-517

Note that libxo uses terminology from YANG (:RFC:`7950`), the data
modeling language for NETCONF (:RFC:`6241`), which uses "leafs" as the
plural form of "leaf".  libxo follows that convention.

.. _csv_no_header:

The `no-header` Option
~~~~~~~~~~~~~~~~~~~~~~

CSV files typically begin with a line that defines the fields included
in that file, in an attempt to make the contents self-defining::

    sku,name,sold,in-stock,on-order
    GRO-000-415,gum,1412,54,10
    HRD-000-212,rope,85,4,2
    HRD-000-517,ladder,0,2,1

There is no reliable mechanism for determining whether this header
line is included, so the consumer must make an assumption.

The CSV encoder defaults to producing the header line, but the
"no-header" option can be included to avoid the header line.

.. _csv_no_quotes:

The `no-quotes` Option
~~~~~~~~~~~~~~~~~~~~~~

:RFC:`4180` specifies that fields containing commas, double quotes, or
line breaks should be quoted, but many CSV consumers do not handle
quotes.  The "no-quotes" option instructs the CSV encoder to avoid the
use of quotes.

.. _csv_dos:

The `dos` Option
~~~~~~~~~~~~~~~~

:RFC:`4180` defines the end-of-line marker as a carriage return
followed by a newline.  This `CRLF` convention dates from the distant
past, but its use was anchored in the 1980s by the `DOS` operating
system.

The CSV encoder defaults to using the standard Unix end-of-line
marker, a simple newline.  Use the "dos" option to use the `CRLF`
convention.

The Encoder API
---------------

The encoder API consists of three distinct phases:

- loading the encoder
- initializing the encoder
- feeding operations to the encoder

To load the encoder, libxo will open a shared library named::

    ${prefix}/lib/libxo/encoder/${name}.enc

This file is typically a symbolic link to a dynamic library, suitable
for `dlopen()`.  libxo looks for a symbol called
`xo_encoder_library_init` inside that library and calls it with the
arguments defined in the header file "xo_encoder.h".  This function
should look as follows::

    int
    xo_encoder_library_init (XO_ENCODER_INIT_ARGS)
    {
        arg->xei_version = XO_ENCODER_VERSION;
        arg->xei_handler = test_handler;

        return 0;
    }

Several features here allow for future compatibility: the macro
XO_ENCODER_INIT_ARGS allows the arguments to this function to change
over time, and XO_ENCODER_VERSION allows the library to tell libxo
which version of the API it was compiled with.

The function placed in xei_handler should have the signature::

    static int
    test_handler (XO_ENCODER_HANDLER_ARGS)
    {
         ...

This function will be called with the "op" codes defined in
"xo_encoder.h".  Each op code represents a distinct event in the libxo
processing model.  For example, OP_OPEN_CONTAINER tells the encoder
that a new container has been opened, and the encoder can behave in an
appropriate manner.
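
As a rough illustration of the op-code dispatch described above, the
following self-contained C sketch mimics the general shape of a
handler: a single function that switches on the op code and updates
private state.  The ``demo_op_t`` codes, the ``demo_state_t``
structure, and the explicit argument list are hypothetical stand-ins
invented for this example; they are not the definitions from
"xo_encoder.h".  A real handler is declared with
XO_ENCODER_HANDLER_ARGS and receives the op codes defined in that
header.

```c
#include <stdio.h>

/* Hypothetical op codes; the real ones are defined in "xo_encoder.h". */
typedef enum {
    DEMO_OP_OPEN_CONTAINER,
    DEMO_OP_CLOSE_CONTAINER,
    DEMO_OP_STRING
} demo_op_t;

/* Hypothetical private state kept by the encoder between calls. */
typedef struct {
    int depth;                  /* current container nesting level */
    char out[256];              /* accumulated output */
    size_t len;                 /* bytes used in out[] */
} demo_state_t;

/*
 * Handle one event.  Returns 0 on success, or -1 for an unknown op
 * code or when the output buffer is full.
 */
static int
demo_handler (demo_state_t *sp, demo_op_t op,
              const char *name, const char *value)
{
    size_t left = sizeof(sp->out) - sp->len;
    int rc;

    switch (op) {
    case DEMO_OP_OPEN_CONTAINER:
        rc = snprintf(sp->out + sp->len, left, "%*sopen %s\n",
                      sp->depth * 2, "", name);
        sp->depth += 1;         /* children are indented one level */
        break;

    case DEMO_OP_CLOSE_CONTAINER:
        if (sp->depth > 0)
            sp->depth -= 1;
        rc = snprintf(sp->out + sp->len, left, "%*sclose %s\n",
                      sp->depth * 2, "", name);
        break;

    case DEMO_OP_STRING:
        rc = snprintf(sp->out + sp->len, left, "%*s%s=%s\n",
                      sp->depth * 2, "", name, value ? value : "");
        break;

    default:
        return -1;              /* op code this sketch does not handle */
    }

    if (rc < 0 || (size_t) rc >= left)
        return -1;              /* formatting failed or output truncated */

    sp->len += (size_t) rc;
    return 0;
}
```

Feeding this sketch an open event for "top", a string event for "sku",
and a close event for "top" leaves three lines in ``out``, with the
leaf indented beneath its container.  Keeping all state in one
structure mirrors the idea that the handler is re-entered once per
event and must remember its position in the hierarchy between calls.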