

Settings pane

Settings for the data source, along with a list of the Data Samples and JavaScript files used in
the current data mapping configuration, can be found on the Settings tab at the left. The
available options depend on the type of data sample that is loaded.

The Input Data settings (especially Delimiters) and Boundaries are essential for obtaining the
data and, eventually, the output that you need. For more explanation, see "Data source settings"
on page 115.

Input Data

The Input Data settings specify how the input data must be interpreted. These settings are

different for each data type. For a CSV file, for example, it is important to specify the delimiter

that separates data fields. PDF files are already delimited naturally by pages, so the input data

settings for PDF files are interpretation settings for text in the file.

CSV file Input Data settings

In a CSV file, data is read line by line, where each line can contain multiple fields. The input

data settings specify to the DataMapper module how the fields are separated.

Field separator: Defines what character separates each field in the file. Even though

CSV stands for comma-separated values, CSV can actually refer to files where fields are

separated using any character, including commas, tabs, semicolons, and pipes.

Text delimiter: Defines what character surrounds text in the file, so that a Field separator
appearing between those text delimiters is not interpreted as a separator. This ensures that, for
example, the field “Smith; John” is not interpreted as two fields, even if the Field separator
is the semicolon.

Comment delimiter: Defines what character starts a comment line.

Encoding: Defines what encoding is used to read the Data Source (US-ASCII, ISO-8859-1,
UTF-8, UTF-16, UTF-16BE or UTF-16LE).

Lines to skip: Defines the number of lines in the CSV file that are skipped and not used as
records.

Set tabs as a field separator: Overrides the Field separator option and uses the Tab
character instead, for tab-delimited files.

First row contains field names: Uses the first line of the CSV as headers, which

automatically names all extracted fields.
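
To make the effect of these settings concrete, the following is a minimal Python sketch that parses CSV data using comparable options. It is not the DataMapper's implementation: the function name, the parameter names, the fallback field names (Field1, Field2, ...) and the order in which lines are skipped and comment lines are dropped are all assumptions made for illustration. As a simplification, it also splits the input on line breaks first, so a quoted field that itself contains a line break is not supported.

```python
import csv


def read_csv_records(raw_bytes, *, field_separator=",", text_delimiter='"',
                     comment_delimiter="#", encoding="utf-8",
                     lines_to_skip=0, tabs_as_separator=False,
                     first_row_has_names=True):
    """Parse CSV bytes into a list of dicts (hypothetical helper, illustration only)."""
    # "Set tabs as a field separator" takes precedence over the Field separator.
    if tabs_as_separator:
        field_separator = "\t"

    # "Encoding" decides how the raw bytes become text.
    text = raw_bytes.decode(encoding)

    # "Lines to skip": drop the first N lines (assumed to happen before
    # comment lines are filtered out).
    lines = text.splitlines()[lines_to_skip:]

    # "Comment delimiter": ignore lines that start with this character.
    if comment_delimiter:
        lines = [ln for ln in lines if not ln.startswith(comment_delimiter)]

    # "Field separator" and "Text delimiter" map onto the csv module's
    # delimiter and quotechar arguments.
    reader = csv.reader(lines, delimiter=field_separator,
                        quotechar=text_delimiter)
    rows = [row for row in reader if row]
    if not rows:
        return []

    # "First row contains field names": use the first row as headers,
    # otherwise fall back to made-up names Field1, Field2, ...
    if first_row_has_names:
        names, rows = rows[0], rows[1:]
    else:
        names = [f"Field{i + 1}" for i in range(len(rows[0]))]
    return [dict(zip(names, row)) for row in rows]


sample = (b"exported by a hypothetical tool\n"
          b"# a comment line\n"
          b"Name;City\n"
          b'"Smith; John";Montreal\n')
records = read_csv_records(sample, field_separator=";", lines_to_skip=1)
print(records)  # [{'Name': 'Smith; John', 'City': 'Montreal'}]
```

Because the semicolon inside "Smith; John" sits between text delimiters, it stays part of a single field, which is the behaviour described for the Text delimiter setting.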
