# ion-docval usage instructions

## Introduction

ion-docval is an XML document validator written in Java. It can validate a document against multiple validation files, and it supports the following validation formats:

- XML Schema (.xsd files)
- Schematron (.sch files)
- SVRL Transformation files (.xsl files, but only those for SVRL, i.e. those created from Schematron files)

With ion-docval, you can run a local validation service without having to upload your documents to an online validation tool.

This is the same software that is used in the NPa Peppol Test Tool.


## Installation

Note: this distribution does not include the Java runtime environment, nor any ready-to-use validation files. You will need to download and configure those yourself.

Installation instructions:

0. Download and install the latest Java runtime environment if you do not have one installed.
1. Extract the ion-docval zip file to the directory of your choosing.
2. Go to the extracted directory and copy `sample_config.xml` to `default_config.xml`
3. Edit `default_config.xml`, uncomment or add document types, and update their values as necessary.

See the Configuration section for more details.

## Usage

ion-docval comes with three tools: ion-docval-server, ion-docval-client, and ion-docval-cli.


### ion-docval-server

This is the main tool that performs the validation. When running, it acts as a small webserver that serves a page where you can validate documents.

Given the defaults from the sample configuration file, it will listen on localhost, port 35791. You can access it with any web browser by visiting the page:

http://localhost:35791/validate

That page contains a single form field, where you can select any .xml document whose type matches one of the keywords in the current configuration.

### ion-docval-client

ion-docval-client can send requests to ion-docval-server for a more automated use of the validation service. It sends the given document to a running server, and prints out the result. This tool is intended to be incorporated in scripts or batch jobs, in those cases where ion-docval is not integrated as a library directly.

### ion-docval-cli

This is a standalone command-line version of the validator. It can validate a document against one or more validation files, specified on the command line. This tool is intended for quickly checking validation files, or for cases where you only need to validate a single file once.
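To illustrate what validating a document against a single XML Schema file involves, here is a minimal sketch using the standard JAXP validation API that ships with Java. It is not ion-docval's own code, and it does not show the ion-docval-cli options. The class name `XsdCheckSketch` and the tiny `Amount` schema are made up for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Illustrative sketch only; this is not ion-docval's implementation.
final class XsdCheckSketch {

    /** Validates xml against xsd and returns the error messages (empty if valid). */
    static List<String> validate(String xsd, String xml) {
        final List<String> errors = new ArrayList<>();
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
            Validator validator = schema.newValidator();
            // Collect recoverable errors instead of stopping at the first one.
            validator.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) { /* ignored in this sketch */ }
                @Override public void error(SAXParseException e) { errors.add(e.getMessage()); }
                @Override public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            validator.validate(new StreamSource(new StringReader(xml)));
        } catch (SAXException e) {
            // Fatal problems (malformed XML, unusable schema) end up here.
            errors.add(e.getMessage());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return errors;
    }

    public static void main(String[] args) {
        // Hypothetical one-element schema, used only for this example.
        String xsd = "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
                + "<xs:element name=\"Amount\" type=\"xs:decimal\"/></xs:schema>";
        System.out.println(validate(xsd, "<Amount>12.50</Amount>")); // prints []
        System.out.println(validate(xsd, "<Amount>abc</Amount>"));   // prints one or more messages
    }
}
```

Validating against several files, as the tool does, would amount to repeating this step per file and merging the collected messages.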


## Performance

Using .sch files directly is easier to set up, as you won't have to convert them to .xsl yourself. However, this slows down the (re)loading of the validation file considerably: depending on the size of the Schematron file, loading a single file may take tens of seconds. Once loaded, a .sch file is as efficient as an SVRL .xsl file that was loaded directly.
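The "load once, reuse many times" pattern behind this can be sketched with the standard `javax.xml.transform` API: a stylesheet is compiled into a `Templates` object once, and each validation then only pays for a cheap `newTransformer()` call. This is an assumption-laden illustration, not ion-docval's code. The class name `CompiledStylesheetSketch` and the toy stylesheet are invented, and real Schematron-derived stylesheets often need an XSLT 2.0 processor rather than the JDK's built-in XSLT 1.0 one.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Illustrative sketch only; this is not ion-docval's implementation.
final class CompiledStylesheetSketch {
    // The expensive part: compiled once, then shared by all later transformations.
    private final Templates compiled;

    CompiledStylesheetSketch(String xsl) {
        try {
            compiled = TransformerFactory.newInstance()
                    .newTemplates(new StreamSource(new StringReader(xsl)));
        } catch (TransformerException e) {
            throw new IllegalArgumentException("Could not compile stylesheet", e);
        }
    }

    /** Applies the already-compiled stylesheet to one document. */
    String apply(String xml) {
        try {
            StringWriter out = new StringWriter();
            compiled.newTransformer()
                    .transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalArgumentException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // Toy stylesheet (hypothetical): counts the <item> children of the root element.
        String xsl = "<xsl:stylesheet version=\"1.0\""
                + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
                + "<xsl:output method=\"text\"/>"
                + "<xsl:template match=\"/\"><xsl:value-of select=\"count(/*/item)\"/></xsl:template>"
                + "</xsl:stylesheet>";
        CompiledStylesheetSketch sheet = new CompiledStylesheetSketch(xsl);
        System.out.println(sheet.apply("<list><item/><item/></list>")); // prints 2
        System.out.println(sheet.apply("<list/>"));                     // prints 0
    }
}
```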

## Configuration

### Options section

* AutoReload: [true/false] specifies whether validation files should be automatically reloaded when they change on disk. Note that this may cause an initial delay with the first request after the change, while the validation file is reloaded.
* UnknownKeywords: [warn/error/fail/ignore] specifies what the validator should do when a document is validated with a keyword that is not in the configuration.
  * warn: Return a normal validation result, but add a warning about the unknown keyword. This warning will be the only content of the result
  * error: Return a normal validation result, but add an error about the unknown keyword. This error will be the only content of the result
  * ignore: Return an empty validation result, with 0 errors and 0 warnings
  * fail: Return a hard error that the document cannot be validated. In a web browser, this will be an error alert. On the HTTP level, it will be a 400 client error.
* LazyLoad: [true/false] when true, the validation files will not be loaded into memory until they are first used. This speeds up startup, at the cost of a slower first response for each document type.
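The four UnknownKeywords settings described above can be summarised in a small sketch. It only models the documented behaviour; the class, enum, and field names are hypothetical and do not come from the ion-docval library, and the exception merely stands in for the hard error that the server reports as an HTTP 400.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; names are hypothetical, not ion-docval's API.
final class UnknownKeywordSketch {
    enum Policy { WARN, ERROR, IGNORE, FAIL }

    /** Minimal stand-in for a validation result. */
    static final class Result {
        final List<String> errors = new ArrayList<>();
        final List<String> warnings = new ArrayList<>();
    }

    static Result handleUnknown(Policy policy, String keyword) {
        Result result = new Result();
        String message = "Unknown keyword: " + keyword;
        switch (policy) {
            case WARN:
                result.warnings.add(message); // the only content of the result
                break;
            case ERROR:
                result.errors.add(message);   // the only content of the result
                break;
            case IGNORE:
                break;                        // empty result: 0 errors, 0 warnings
            case FAIL:
                // Hard failure; on the HTTP level this corresponds to a 400 client error.
                throw new IllegalArgumentException(message);
        }
        return result;
    }

    public static void main(String[] args) {
        Result r = handleUnknown(Policy.WARN, "my-keyword");
        System.out.println(r.warnings.size() + " warning(s), " + r.errors.size() + " error(s)");
        // prints: 1 warning(s), 0 error(s)
    }
}
```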

### Server section

In the Server Section, you can specify one or more address/port combinations to listen on. The defaults should in general be fine.

Note that changing the IP address to anything other than 127.0.0.1 is strongly discouraged, especially to a publicly routable IP address. The internal web server is very minimal and offers no security features such as TLS or IP whitelisting. If you want the validation server to be reachable from a wider network, we strongly advise doing so through a reverse proxy such as nginx, which can provide these features.

Each Listen entry has the following values:

* Address: [IP address] The IP address to listen on
* Port: [integer] The port number to listen on

### Document type section

This is where you define which document types ion-docval will validate for you. You can specify as many DocumentType entries as you wish, as long as each entry's Keyword value is unique.

Each entry contains the following elements:

* Name: [string] A user-friendly name for this document type, such as 'SI-UBL 2.0'
* Description: [string] A description of this document type
* Keyword: [string] The keyword by which the server determines which document type a given document must be validated against. See the Keywords section for more information.
* ValidationFile: [filename] A validation file that documents of this type should be validated against. The filename must end in either .xsd (for XML Schema files), .sch (for Schematron files), or .xsl (for SVRL stylesheets). You can specify multiple ValidationFile entries for each document type.
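As a small illustration of the ValidationFile extension rule above, the following sketch maps a filename to the kind of validation it implies. The class and enum names are hypothetical, and matching the extension case-sensitively is an assumption of this example rather than documented behaviour.

```java
// Illustrative sketch only; names are hypothetical, not ion-docval's API.
final class ValidationFileKindSketch {
    enum Kind { XML_SCHEMA, SCHEMATRON, SVRL_STYLESHEET }

    /** Maps a validation filename to its kind, based purely on the extension. */
    static Kind kindOf(String filename) {
        // Assumption: extensions are matched exactly as written (lowercase).
        if (filename.endsWith(".xsd")) return Kind.XML_SCHEMA;
        if (filename.endsWith(".sch")) return Kind.SCHEMATRON;
        if (filename.endsWith(".xsl")) return Kind.SVRL_STYLESHEET;
        throw new IllegalArgumentException(
                "Validation file must end in .xsd, .sch or .xsl: " + filename);
    }

    public static void main(String[] args) {
        System.out.println(kindOf("rules.sch"));   // prints SCHEMATRON
        System.out.println(kindOf("invoice.xsd")); // prints XML_SCHEMA
    }
}
```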


### Keywords

Keywords are the way ion-docval-server will choose which set of validation files to use when validating any given document.

If you use the command-line client, or use the jar library directly, you may be in a situation where the caller knows exactly which document type a given document has. In that case, you are free to choose the keyword you wish to use for each document type, as long as every keyword is unique.

For ion-docval-server, or any case where the caller may *not* know exactly which keyword to use, there is a strict process of deriving the keyword from any given document. The keyword you use in your configuration MUST follow this process as well. For convenience, the ion-docval-cli tool provides an option to derive the keyword from a document in the same way that the server will, so you can use that value in your configuration.

The format depends on whether the document is a UBL document, a CII document, or any other XML document, and comprises up to 4 elements:

1. namespace: The XML namespace of the root element (if any)
2. root element: The tag name of the root element (e.g. Invoice)
3. Customization ID: The customization ID for UBL, or the GuidelineSpecifiedDocumentContextParameter in case of CII
4. version: The UBL version in case of UBL, or D16B in case of CII.

Depending on the general document type, this yields the following formats:

* UBL: `<namespace>::<root element>##<customization id>::<version>`
* CII: `<namespace>::<root element>##<customization id>::D16B`
* Any other xml with namespaces: `<namespace>::<root element>`
* Any other xml without namespaces: `<root element>`
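The formats above can be sketched in code as follows. This is only an illustration of the string layout, not the derivation code used by ion-docval: `KeywordSketch` and its method names are hypothetical, the customization ID in the usage example is a placeholder, and the UBL and CII helpers take the customization ID and version as arguments rather than extracting them from the document. To obtain the keyword the server will actually use for a document, rely on the ion-docval-cli option mentioned earlier.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Illustrative sketch only; names are hypothetical, not ion-docval's API.
final class KeywordSketch {

    /** UBL: <namespace>::<root element>##<customization id>::<version> */
    static String ublKeyword(String namespace, String root, String customizationId, String version) {
        return namespace + "::" + root + "##" + customizationId + "::" + version;
    }

    /** CII: <namespace>::<root element>##<customization id>::D16B */
    static String ciiKeyword(String namespace, String root, String customizationId) {
        return namespace + "::" + root + "##" + customizationId + "::D16B";
    }

    /** Any other XML: <namespace>::<root element>, or just <root element> without a namespace. */
    static String genericKeyword(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required to see the root element's namespace
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            Element root = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            String namespace = root.getNamespaceURI();
            String name = root.getLocalName();
            return (namespace == null || namespace.isEmpty()) ? name : namespace + "::" + name;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(genericKeyword("<order/>"));
        // prints: order
        System.out.println(genericKeyword("<o:order xmlns:o=\"urn:example:orders\"/>"));
        // prints: urn:example:orders::order
        System.out.println(ublKeyword(
                "urn:oasis:names:specification:ubl:schema:xsd:Invoice-2",
                "Invoice",
                "urn:example:customization", // placeholder customization ID
                "2.1"));
        // prints: urn:oasis:names:specification:ubl:schema:xsd:Invoice-2::Invoice##urn:example:customization::2.1
    }
}
```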

When using the command-line client ion-docval-client, or when calling the library directly, you can specify a specific keyword of your own choosing, as long as it matches the correct keyword from the configuration.
