Thin, schema-first and robust data-interchange object format for Internet
This document aims to provide the Internet Object 1.0 specification and showcase various aspects of the subject.
Author and Researcher
Mohamed Aamir Maniar at ManiarTech®️ Lab
Contact
Version
1.0
Status
Work-in-Progress
Draft
Website
(WIP)
Docs
Last Updated
27th February 2025
The Internet Object format is a document-oriented format that emphasizes the separation of header and data. This structure is similar to that of HTML and MIME, where the header is kept separate from the data or body.
In an Internet Object document, the header is optional but can be used to define schemas and definitions. When a header is present, the data section starts with the --- separator. This separator is the first element of the data section and is mandatory to distinguish it from the header.
[ Internet Object Document Structure Diagram ]
If an Internet Object document includes both a header and a data section you can call it a full document.
When an Internet Object document contains only a data section, the --- separator may be omitted. Such documents are sent to the server without any header because the schema is either not required or already known to the recipient.
With Separator:
Without Separator:
In many cases, a query-generating document may not yield any results. In such cases, you can use the header with result metadata to send the query and the results. However, it is important to include the --- separator to mark the end of the header and the start of the data section.
An Internet Object document can contain multiple data sections. This allows the user to embed multiple types of data collections in a single document.
The Internet Object document structure is designed to be simple and flexible. The next section discusses the Header and Data sections in detail.
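The header and data split described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a reference parser: the helper name `split_document` is hypothetical, and it assumes that any line beginning with `---` is a separator and that a document with no separator is a single data section.

```python
from typing import List, Optional, Tuple


def split_document(text: str) -> Tuple[Optional[str], List[str]]:
    """Split a document into (header, data_sections).

    Hypothetical helper. Assumption: a line beginning with '---' is a
    section separator. Text before the first separator is the header;
    a document with no separator is treated as one data section.
    """
    header_lines: List[str] = []
    sections: List[List[str]] = []
    for line in text.splitlines():
        if line.strip().startswith("---"):
            sections.append([])            # a separator opens a new data section
        elif sections:
            sections[-1].append(line)      # line belongs to the current data section
        else:
            header_lines.append(line)      # line appears before any separator
    if not sections:
        # No separator at all: the whole document is data and there is no header.
        return None, [text.strip()]
    header = "\n".join(header_lines).strip() or None
    return header, ["\n".join(section).strip() for section in sections]
```

Calling `split_document("name, age\n---\nJohn, 25")` yields the header `"name, age"` and one data section, while `split_document("John, 25")` yields no header and the same single data section.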
This poem encapsulates the core guiding principles that shape the design and objectives of the Internet Object format.
The poem serves as a unique and memorable medium to communicate the foundational principles that form the basis of the Internet Object format. By creatively capturing these concepts in verse, the poem enables readers to appreciate and remember the essence of the data interchange system more effectively. The Internet Object format is designed to be efficient, clear, and versatile in facilitating data interchange, and the poem highlights these attributes by artistically expressing the philosophy and goals that drive its development. The poem, therefore, not only adds an engaging element to the specification but also reinforces the core values of the Internet Object format.
Size holds weight, in bytes confined, Small prevails, large left behind.
Simplicity shines over complexity's shroud, Readability echoes, accurate and loud.
Reusability births productivity's rise, Verbosity's burden efficiency defies.
Data, definitions, separate ways, Together they clutter, apart they amaze.
Headers and data, distinctions drawn, Confusion dissolves, clarity's dawn.
Errors and statuses, data's divide, Their entanglement brings chaos inside.
Two lone records, states unswayed, No interference, connections unmade.
Trust not the sender, vigilance displayed, Expect the unanticipated, foundations laid.
Surprises, enchanting, yet beware, Not all of them good, handle with care.
Internet Object is a data interchange format designed for modern network communication. This specification introduces Internet Object as a text-based, schema-first, document-oriented, and streamable format that prioritizes human readability and language independence. Internet Object aims to optimize the serialization of structured data for efficient transmission between servers and clients across the internet.
Internet Object (IO) is a document-oriented data serialization format designed to optimize data transmission over networks. This specification introduces IO as an alternative to existing formats such as JSON, offering a structured approach to data representation and exchange.
The fundamental structure of IO is an ordered collection of values, analogous to CSV (Comma-Separated Values) but with extended capabilities. These capabilities include support for nested objects, arrays, and inline keys, providing enhanced expressiveness and flexibility.
Document-Oriented Design: In contrast to value-oriented formats, IO adopts a document-centric approach, facilitating the separation of data from definitions to enhance clarity and maintainability.
Ordered Collection with Extended Functionality: IO's core structure maintains an ordered collection of values while supporting complex data structures such as nested objects and arrays.
Schema-First Approach: IO emphasizes schema-first design to ensure data consistency and predictability. While schemas are optional, their inclusion significantly enhances data integrity and validation.
Concise Syntax: The syntax of IO is optimized for readability and efficiency, minimizing data size without compromising clarity.
Metadata Integration: IO documents can incorporate metadata, variables, and multiple schemas within the header section, providing comprehensive context for the data.
The following example demonstrates a basic IO document structure:
This structure illustrates IO's concise syntax and inherent schema support. For comparison, an equivalent JSON representation would be:
IO supports collections and various data types, as demonstrated in the following example:
This example illustrates several key features:
Explicit data type definitions in the schema (string, int, bool)
Nested object structures (address)
Collection of objects denoted by the tilde (~) prefix
Correspondence between the order of values and the schema definition
The equivalent JSON representation would be:
This comparison demonstrates IO's capacity to represent structured data collections efficiently, offering a compact and readable format while maintaining an ordered structure.
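The correspondence between value order and schema order can be illustrated with a short Python sketch. The in-memory shapes used here (a list of key names, with a `(name, nested_keys)` tuple for a nested object) and the sample values are assumptions made for illustration; they are not defined by the specification.

```python
import json


def record_to_dict(keys, values):
    """Pair positional values with schema keys, recursing into nested objects.

    `keys` holds either a key name or a (name, nested_keys) tuple for a
    nested object. This representation is hypothetical.
    """
    result = {}
    for key, value in zip(keys, values):
        if isinstance(key, tuple):
            name, nested_keys = key
            result[name] = record_to_dict(nested_keys, value)
        else:
            result[key] = value
    return result


# Made-up schema and record, used only to show the mapping.
schema = ["name", "age", "isActive", ("address", ["street", "city"])]
record = ["John Doe", 25, True, ["Main St", "Anytown"]]
print(json.dumps(record_to_dict(schema, record)))
```

Because the keys live in the schema, the record itself carries only values, which is where the size saving over the keyed JSON form comes from.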
In many scenarios, it's beneficial to define schemas separately from the data. This approach allows for schema reuse, versioning, and easier maintenance. Here's an example of a separate schema followed by a document using that schema:
Separate Schema (person.io)
Document with Collection and Metadata
In this example:
The schema is defined separately, potentially in a file named "person.io".
The document references the schema URL in its metadata.
The document includes additional metadata such as record count and pagination information.
The collection contains multiple records, each prefixed with ~.
Each record follows the structure defined in the schema, including an array of skills.
This structure allows for efficient data transmission, as the schema only needs to be sent once and can be cached by the receiving system. It also facilitates easy updates to the schema without necessarily changing the data format.
Internet Object represents a significant advancement in data serialization technology. By combining the simplicity of ordered collections with the robustness of schema-based validation, Internet Object offers a powerful yet accessible solution for modern data exchange needs. Its key strengths include:
Efficiency in data transmission and storage
Clarity through its schema-first approach and document-oriented design
Flexibility in handling various data structures and types
Compatibility with existing JSON-based systems
These attributes make Internet Object suitable for a wide range of applications, from web-based and networked environments to data storage and interchange in diverse domains such as IoT, cloud computing, and enterprise systems.
The subsequent sections of this specification provide comprehensive details on Internet Object's syntax, schema definition language, supported data types, and advanced features. This information will enable developers, system architects, and data engineers to fully leverage the capabilities of Internet Object in their projects and applications.
As data exchange continues to play a crucial role in our interconnected world, Internet Object stands poised to address current challenges and anticipate future needs in data serialization and transmission.
Internet Object began as a side project aimed at addressing the limitations observed in the JSON format. Over time, it evolved into an independent research endeavor, focusing on effectively tackling data-transfer challenges such as size, schema validation, data streaming, and header and metadata support, among others. The design of the Internet Object format revolves around the following key objectives:
To optimize the format for internet wire transfer, the Internet Object must be conceived and developed without being excessively influenced by existing mechanisms. However, it may draw inspiration from other formats as needed.
Internet Object documents must be text-based, human-friendly, and easy to work with. Developers should be able to write these documents using plain text IDEs without needing any frameworks, libraries, or utilities.
To ensure a small footprint, the Internet Object format should separate data and schema, allowing data to be sent alone over the network.
To uphold data integrity during wire transfer, the Internet Object format should prioritize a schema-first approach.
Embracing a comprehensive document-oriented approach, the Internet Object format should facilitate the bundling of all essential components - including records, data, definitions, schemas, and comments - within a single document. This approach ensures that all related information is conveniently stored together, promoting the efficient organization and streamlined management of data and its associated elements. Moreover, it enhances maintainability and simplifies collaboration among team members, as they can easily access and understand the complete context within a single, unified document.
The Internet Object format must support complex data types so that any kind of data, whether a large number or a complex data structure, can be easily serialized and deserialized for the wire.
The Internet Object format should support the streaming of independent records, allowing for efficient and continuous data transfer. With this feature, the failure of a single record will not affect the processing of other records, ensuring more resilient data transmission.
The Internet Object format should be designed to work seamlessly across different platforms, operating systems, and programming languages. This universal compatibility ensures broad adoption and versatility, enabling developers to easily integrate the format into their projects without limitations.
By providing support for inline comments, the Internet Object format allows users to document schemas and definitions directly within the data itself. This feature enhances readability and maintainability, making it easier for users to understand and manage complex data structures.
To increase the format's adaptability, the Internet Object should promote reusability through concepts like references and variables. This dynamic feature allows for the customization of data structures and enables users to manipulate data more effectively, catering to various needs and scenarios.
The header of an Internet Object document is positioned at the beginning of the document and serves a crucial role in defining the schema or associated definitions for the data it contains. This section includes essential metadata, context, variables, and schema references for the document's content. It plays an important role in ensuring that the data is presented in a consistent format and provides the necessary information for accurate interpretation and processing.
[ Header Image Placeholder ]
In this schema example, five keys are defined with additional details:
name: Represents a standard key, expected to contain a value such as a string.
age:int: Specifies that the age key should contain an integer value, indicating the data type explicitly.
address: Another standard key, which could hold a more complex value like a string or an object, depending on the context.
isActive?: The question mark (?) signifies that the isActive key is optional, meaning it may or may not be present in the data.
remark: Represents a standard key, expected to contain a value, likely a string, which could hold additional comments or notes.
This schema not only defines the structure but also includes type annotations and optionality, enhancing the clarity and robustness of the data model. By using this schema, the document can ensure consistent and accurate data representation, making it easier to process and interpret across different systems.
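As a rough illustration of how such key declarations could be read, the Python sketch below splits a declaration into a name, an optional type, and an optionality flag. The names `KeyDecl` and `parse_key_decl` are hypothetical, and the sketch handles only the two notations described above (a `:type` annotation and a trailing `?`).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class KeyDecl:
    name: str
    type: Optional[str]   # None when the declaration has no type annotation
    optional: bool        # True when the key name ends with '?'


def parse_key_decl(decl: str) -> KeyDecl:
    """Parse one key declaration such as 'name', 'age:int' or 'isActive?'."""
    name, sep, type_name = decl.partition(":")
    name = name.strip()
    optional = name.endswith("?")
    if optional:
        name = name[:-1].rstrip()          # drop the '?' marker from the name
    return KeyDecl(name, type_name.strip() if sep else None, optional)
```

For example, `parse_key_decl("age:int")` gives a declaration named `age` with type `int`, and `parse_key_decl("isActive?")` gives an optional key with no explicit type.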
Definitions, at their core, are collections of key-value pairs used to declare metadata, variables, complex schemas, and other key-value pairs within the header of an Internet Object document.
In this example, the header contains response metadata and schema details presented as Definitions, rather than using a Default Schema as seen in the previous example. The Definitions provide metadata that specify the page size (pageSize), the current page number (currentPage), and the total record count (recordCount). Additionally, more complex structures are defined, such as an address schema ($address) with nested keys (street, city, state) and a higher-level schema ($schema) that references both simple and complex data types. The $schema is a reserved key used to define the default schema for the document.
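One way to picture how a consumer might organize these definitions is to group them by key prefix. The Python sketch below assumes the definitions have already been parsed into a dictionary (the function name, the dictionary shape, and the sample values are hypothetical). It treats `$`-prefixed keys as schemas, `@`-prefixed keys as variables, and `$schema` as the default schema.

```python
def classify_definitions(definitions: dict) -> dict:
    """Group header definitions by the prefix of their key.

    Hypothetical helper: '$' marks a schema, '@' marks a variable, and
    everything else is treated as plain metadata. The reserved key
    '$schema' is recorded as the default schema.
    """
    groups = {"metadata": {}, "schemas": {}, "variables": {}, "default_schema": None}
    for key, value in definitions.items():
        if key == "$schema":
            groups["default_schema"] = value
        elif key.startswith("$"):
            groups["schemas"][key] = value
        elif key.startswith("@"):
            groups["variables"][key] = value
        else:
            groups["metadata"][key] = value
    return groups


# Made-up values; the schema bodies are kept as opaque strings here.
example = classify_definitions({
    "pageSize": 10,
    "currentPage": 1,
    "recordCount": 100,
    "$address": "{street, city, state}",
    "$schema": "{name, age, address: $address}",
})
```

In this example the three paging values land under `metadata`, `$address` under `schemas`, and the `$schema` entry becomes the default schema.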
The schema is a fundamental component of the Internet Object format, defining the structure and semantics of the data within an Internet Object document. When the header contains only a schema, it is referred to as a "default schema." This schema is typically used to outline the structure of the data included in the document, separating the structure definition from the data itself. This separation makes the data more compact, readable, and easier to process. For more detailed information about schemas, refer to .
Special characters are used in conjunction with structural characters and literals to provide additional functionality or context within an Internet Object document. The following are the special characters used in the Internet Object format.
| Symbol | Name | Unicode | Description |
| --- | --- | --- | --- |
| ? | Question Mark | U+003F | Shortcut for declaring optional member when suffixed to the key name in object schema |
| * | Asterisk | U+002A | Shortcut for declaring nullable member when suffixed to the key name in object schema. Also used to make schema accept undeclared variables. |
| - | Hyphen / Minus | U+002D | Represents negative value |
| + | Plus | U+002B | Represents positive value |
The Internet Object format includes several structural characters, literals, and other special characters that are used to structure and delimit data within a document. These characters are used in conjunction with objects, strings, arrays, numbers, and whitespace to create complex and flexible data structures.
In the Internet Object format, whitespace refers to any character with a Unicode code point less than or equal to U+0020 (i.e., characters in the range U+0000 to U+0020). This range includes both non-printable control characters and common whitespace characters such as the horizontal tab (U+0009), newline (U+000A), vertical tab (U+000B), form feed (U+000C), carriage return (U+000D), and space (U+0020).
In addition to the characters in the range U+0000 to U+0020, the Internet Object format also treats characters in the Unicode whitespace category as whitespace. This includes characters such as the non-breaking space (U+00A0), em space (U+2003), and en space (U+2002), among others. Including Unicode whitespace characters can make it easier to work with text in languages that use non-Latin scripts, such as Arabic, Chinese, or Japanese.
It's also worth noting that the Internet Object format recognizes the zero-width non-breaking space (U+FEFF) as whitespace. This character is often used as a byte order mark (BOM) in Unicode-encoded documents. Incorporating a more comprehensive range of whitespace characters in Internet Object offers several advantages that can make the format easier to work with, more readable, and more compatible with different systems and programming languages.
The following table lists the valid whitespace characters:
| Code Point | Name | Description |
| --- | --- | --- |
| U+0000 to U+0020 | Space, Line Feed, Carriage Return, Tab, Bell, etc. | Any character having charCode <= 0x20, such as space. Includes ASCII space and control characters. |
| U+1680 | Ogham Space Mark | Space used in Ogham scripts. |
| U+2000 | En Quad | Space equal to the width of the lowercase letter "n". |
| U+2001 | Em Quad | Space equal to the width of the uppercase letter "M". |
| U+2002 | En Space | Space equal to half the width of the em space. |
| U+2003 | Em Space | Space equal to the width of the em space. |
| U+2004 | Three-per-Em Space | Space equal to one-third of an em space. |
| U+2005 | Four-per-Em Space | Space equal to one-quarter of an em space. |
| U+2006 | Six-per-Em Space | Space equal to one-sixth of an em space. |
| U+2007 | Figure Space | Space equal to the width of a numeral character. |
| U+2008 | Punctuation Space | Space used for punctuation. |
| U+2009 | Thin Space | Space narrower than the regular space character. |
| U+200A | Hair Space | Very narrow space used for special purposes. |
| U+2028 | Line Separator | Character used to separate lines in text. |
| U+2029 | Paragraph Separator | Character used to separate paragraphs in text. |
| U+202F | Narrow No-Break Space | Non-breaking space narrower than the regular space character. |
| U+205F | Medium Mathematical Space | Space used in mathematical notation. |
| U+3000 | Ideographic Space | Space used in East Asian scripts. |
| U+FEFF | Byte Order Mark (BOM) | Zero Width Non-Breaking Space, often used as a BOM. |
Case Sensitivity: All whitespace characters are recognized based on their Unicode code points. Ensure that the correct character is used to avoid parsing issues.
Whitespace Sensitivity: Internet Object is not whitespace sensitive, meaning that the parser ignores whitespace surrounding values and structural elements. However, any whitespace characters found within the values or strings themselves are preserved.
Reserved Characters: All listed whitespace characters are reserved and should not be used as part of identifiers or keys to prevent conflicts and parsing errors.
Best Practices:
Enhance Readability: Use whitespace characters like spaces and tabs to format your document for better readability.
Avoid Unnecessary Whitespace: While whitespace can improve readability, excessive or unnecessary whitespace can clutter the document.
Consistent Formatting: Maintain a consistent use of whitespace throughout the document to ensure uniformity and ease of maintenance.
Be Mindful of Invisible Characters: Some whitespace characters, like zero-width spaces, are invisible but can affect the parsing and rendering of the document. Use them only when necessary.
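The whitespace rules can be expressed as a small predicate. The Python sketch below follows the code points listed in the whitespace table; the function names are hypothetical, and because the non-breaking space (U+00A0) is mentioned in the prose but not listed in the table, this sketch leaves it out, which is an assumption on its part.

```python
# Code points listed in the whitespace table beyond the U+0000..U+0020 range.
_EXTRA_WHITESPACE = frozenset(
    [0x1680, 0x2028, 0x2029, 0x202F, 0x205F, 0x3000, 0xFEFF]
    + list(range(0x2000, 0x200B))          # U+2000 through U+200A
)


def is_io_whitespace(ch: str) -> bool:
    """Return True if the single character `ch` is in the whitespace table."""
    code_point = ord(ch)
    return code_point <= 0x20 or code_point in _EXTRA_WHITESPACE


def trim_io_whitespace(text: str) -> str:
    """Drop surrounding whitespace but preserve whitespace inside the value."""
    start, end = 0, len(text)
    while start < end and is_io_whitespace(text[start]):
        start += 1
    while end > start and is_io_whitespace(text[end - 1]):
        end -= 1
    return text[start:end]
```

Trimming only at the edges mirrors the rule that whitespace around values is ignored while whitespace inside values is preserved.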
Structural characters define the structure of data within an Internet Object document. Below are the structural characters used in the Internet Object format:
| Symbol | Name | Unicode | Role | Description |
| --- | --- | --- | --- | --- |
| , | Comma | U+002C | Separator between values | Used to separate items in arrays and objects |
| ~ | Tilde | U+007E | Record delimiter in collections | Indicates the start of a new record in a collection |
| : | Colon | U+003A | Key-value separator | Separates keys from their corresponding values |
| [ | Open Square Bracket | U+005B | Start of an array | Begins an array structure |
| ] | Close Square Bracket | U+005D | End of an array | Ends an array structure |
| { | Open Curly Bracket | U+007B | Start of an object | Begins an object or dictionary |
| } | Close Curly Bracket | U+007D | End of an object | Ends an object or dictionary |
| --- | Triple Hyphens | U+002D | Header and sections separator | Separates different sections of the document |
| # | Hash | U+0023 | Comment start | Initiates a single-line comment |
| " | Double Quote | U+0022 | String delimiter | Encloses string values |
| ' | Single Quote | U+0027 | String delimiter | Alternative to double quotes for strings |
| @ | At Symbol | U+0040 | Variable | Represents the start of a variable |
| $ | Dollar Sign | U+0024 | Schema | Represents the start of a schema identifier |
The Data Section in an Internet Object Document is where the actual data resides. An Internet Object document can have one or more Data Sections, each defined by a separator line (---) and optionally accompanied by a section name and schema. The data itself can be represented as either a single object or a collection of objects, allowing for a flexible and structured approach to data representation. The following diagram shows the structure of the Data Section.
Each Data Section begins with a separator line (---), which organizes the document into distinct sections. The separator line can include optional elements:
Section Name: Identifies the section and its purpose.
Schema Name: Defines the structure or constraints of the data, prefixed with $.
ℹ️ The separator line must end with a newline character (\n) or EoF (End of File).
The separator line can take on various forms for different levels of detail, each ending with a newline character (\n) or EoF (End of File):
Without Name and Schema: The simplest form, just the separator (---).
With Section Name: The separator followed by a section name (--- employee).
With Section Name and Schema: The separator followed by a section name and a schema name, separated by a colon (--- employee : $employee).
With Only Schema: The separator followed by just the schema name (--- $employee).
Omitting Section Name: In a multi-section document, the section name can be omitted only once. When omitted, the section name will be derived from the associated schema (e.g., --- $employee implies that the section name is employee).
Default Section Name and Schema: If both the section name and schema are omitted, the section name will default to data, and a default schema will be used.
Unique Section Names: Each section in an Internet Object Document must have a unique section name. Duplicate section names are not allowed.
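The separator forms and naming rules above can be sketched as a small Python parser. The function names are hypothetical, and returning `None` for the schema to mean "use the document's default schema" is a choice made for this sketch rather than something the specification prescribes.

```python
from typing import List, Optional, Tuple


def parse_separator(line: str) -> Tuple[str, Optional[str]]:
    """Return (section_name, schema_name) for a separator line.

    schema_name is None when the document's default schema applies
    (a representation chosen for this sketch).
    """
    line = line.strip()
    if not line.startswith("---"):
        raise ValueError("not a separator line")
    rest = line[3:].strip()
    if not rest:
        return "data", None                     # ---
    name, sep, schema = (part.strip() for part in rest.partition(":"))
    if sep:
        if not schema.startswith("$"):
            raise ValueError("schema name must be prefixed with $")
        return name, schema                     # --- employee : $employee
    if name.startswith("$"):
        return name[1:], name                   # --- $employee
    return name, None                           # --- employee


def section_names(separator_lines: List[str]) -> List[str]:
    """Collect section names and reject duplicates."""
    names = [parse_separator(line)[0] for line in separator_lines]
    if len(names) != len(set(names)):
        raise ValueError("duplicate section name")
    return names
```

Deriving the name from `$employee` and defaulting to `data` both feed into the same uniqueness check, so a second bare `---` or a second `--- $employee` in one document is reported as a duplicate.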
This is the simplest form of the separator line. It uses the default section name (data) and the default schema set for the document.
Here the section name is employee. The schema will be the default schema set for the document.
Here the section name and schema are both explicitly mentioned as employee and $employee respectively.
Here only the schema is mentioned. The section name will be derived from the schema name (employee). However, if that section name is already used in the document, it is an error.
After the separator line, the data within a section is introduced. This data can either be a single object or a collection of objects. The flexibility in data representation allows the Internet Object Document format to handle various types of information efficiently.
Objects are structured data entities composed of key-value pairs. Each object is defined within curly braces {} and can contain nested objects or other data types, forming a hierarchical structure.
Collections represent a list of objects, making it possible to include multiple records within a single Data Section. Each object within a collection is defined in the same way as a standalone object but is part of a broader collection context.
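As a minimal illustration of how a collection body might be broken into records, the Python sketch below treats every line starting with `~` as the beginning of a new record. The function name is hypothetical, and the line-based splitting is a simplification that ignores `~` characters appearing inside quoted strings.

```python
from typing import List


def split_records(section_body: str) -> List[str]:
    """Split a collection body into record strings, one per '~' prefix.

    Simplified sketch: returns an empty list when the body holds a
    single object rather than a collection.
    """
    records: List[str] = []
    for line in section_body.splitlines():
        stripped = line.strip()
        if stripped.startswith("~"):
            records.append(stripped[1:].strip())    # start of a new record
        elif stripped and records:
            records[-1] += "\n" + stripped          # continuation of the current record
    return records
```

Because each record is isolated as its own string, a consumer can parse and validate records one at a time and report a failure for one record without discarding the others.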
A single object can be represented after the separator.
It is not necessary to have a section separator for a single section document if there is no header or schema. Hence, the above example can be written as:
A collection is represented by listing objects, each prefixed with ~ on separate lines:
You can have an empty data section. An empty data section can be represented by just the separator line without any data.
It is not necessary to have a section separator for an entirely empty document.
An Internet Object Document can include multiple sections, each with its own data:
The Data Section, organized by separators and structured using objects and collections, offers a robust and flexible method for handling data within Internet Object Documents. This structure ensures that the documents are clear, consistent, and effective for a wide range of applications.
Literals are specific values that can be used within an Internet Object document. They represent special values and basic data indicators. Below are the literals used in the Internet Object format:
Case Sensitivity: All literals are case-sensitive. For example, True or FALSE are not recognized as valid literals.
Short vs. Long Forms: The short forms (T, F, N) are convenient for brevity, while the long forms (true, false, null) enhance readability and compatibility with JSON.
Arrays in Internet Object
An array is represented by a pair of square brackets, which may contain zero or more values. It begins with an open square bracket ([ U+005B) and ends with a close square bracket (] U+005D). Each value is separated by commas (, U+002C). Essentially, an array is expressed as a sequence of values separated by commas enclosed in square brackets.
Arrays can contain values of various types, including objects, other arrays, strings, numbers, boolean, and null.
A simple array of strings:
An array of objects:
An array with mixed values:
Arrays can be nested to create multi-dimensional data structures.
Two-dimensional arrays represent rows and columns:
Three-dimensional arrays represent collections of two-dimensional arrays:
An empty array is represented by a pair of square brackets with no values:
Empty values between array elements are not permitted. To represent a missing value, you must explicitly specify a valid value such as null. Because the Internet Object specification neither assumes null by default nor supports undefined, any omission is strictly forbidden. The following are some examples of invalid array structures:
| Literal | Description | Notes |
| --- | --- | --- |
| T | Boolean value True (short) | Case-sensitive |
| true | Boolean value True | Use interchangeably with T |
| F | Boolean value False (short) | Case-sensitive |
| false | Boolean value False | Use interchangeably with F |
| Inf | Number value Infinity | Represents positive infinity |
| -Inf | Number value Negative Infinity | Represents negative infinity |
| NaN | Number value Not a Number | Represents an undefined or unrepresentable value |
| N | Null value (short) | Case-sensitive |
| null | Null value | Use interchangeably with N |
| Symbol | Name | Unicode | Description |
| --- | --- | --- | --- |
| , | Comma | U+002C | Used as a value separator |
| [ | Open Square Bracket | U+005B | Begins an array boundary |
| ] | Close Square Bracket | U+005D | Closes an array boundary |
Internet Object values must adhere to a specific set of data types, including Objects, Arrays, Strings, Numbers, Boolean values, and Null. The values can also be represented using specific literals such as T, true, F, false, Inf, -Inf, NaN, N, and null. By adhering to this set of data types, Internet Object promotes consistency and interoperability across different implementations and use cases.
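The literal set lends itself to a direct lookup table. The Python sketch below maps each literal to a native value and relies on exact string matching for case sensitivity; the names `LITERALS` and `parse_literal` are hypothetical.

```python
import math

# Exact, case-sensitive spellings of the literals listed in the specification.
LITERALS = {
    "T": True, "true": True,
    "F": False, "false": False,
    "N": None, "null": None,
    "Inf": math.inf, "-Inf": -math.inf,
    "NaN": math.nan,
}


def parse_literal(token: str):
    """Return (True, value) if `token` is a literal, otherwise (False, None).

    A pair is returned because None is itself a valid literal value.
    """
    if token in LITERALS:
        return True, LITERALS[token]
    return False, None
```

With this table, `parse_literal("T")` and `parse_literal("true")` both resolve to the boolean true value, while `parse_literal("True")` is not recognized as a literal.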
Strings in Internet Object
Like many programming languages, strings in Internet Object are a sequence of Unicode codepoints. They may be enclosed in quotation marks (" U+0022 or ' U+0027) or remain free without being enclosed. One noticeable difference with Internet Object strings is that they all preserve the whitespace found within the boundary.
The Internet Object strings can be written in three different formats (a) Open Strings (b) Regular Strings (c) Raw Strings. All of these formats have different ways of representing strings and handling escapes.
Work in Progress!
Objects are a fundamental element in Internet Object documents, providing a clear and intuitive way to represent structured data.
An object is expressed as a sequence of values (and key/value pairs) separated by commas (, U+002C). For simplicity, clarity, and ease of reading, Internet Object supports two modes for objects: Open and Closed. The Open mode does not require enclosing values in curly brackets and is only supported for top-level objects.
| Symbol | Characters | Unicode | Description |
| --- | --- | --- | --- |
| , | Comma | U+002C | Used as a value separator |
| : | Colon | U+003A | Key-value separator |
| [ | Open Square Bracket | U+005B | Begins an array boundary |
| ] | Close Square Bracket | U+005D | Closes an array boundary |
| { | Open Curly Bracket | U+007B | Begins an object boundary |
| } | Close Curly Bracket | U+007D | Closes an object boundary |
Objects can contain values of various types, including other objects, arrays, strings, numbers, boolean, and null. Keys can also be attached to all or some of the values to provide more information and make the objects more self-explanatory. The keys are valid Internet Object string values, and any format of string (Open, Regular, Raw) can be used to represent them.
An object is essentially an ordered collection of values similar to CSV records.
Objects with child objects and arrays.
An object is not required to be wrapped inside the curly braces unless it is a child object. However, putting them in between the braces will not make it invalid.
Object structure also supports unique inline keys. In the following example isActive, address, and personalities have associated keys.
Inline keys can be attached to all the values or some of them. When an object contains both types of values (key value and non-key value), the key-value pair must be placed after the non-key values (sequential values).
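The ordering rule for mixed objects can be checked mechanically. The Python sketch below assumes the object has already been tokenized into `(key, value)` pairs, with `None` as the key of a positional value; that representation and the function name are hypothetical.

```python
def check_member_order(members) -> None:
    """Raise ValueError if a positional value follows a keyed member.

    `members` is a list of (key, value) pairs where key is None for a
    positional (non-key) value. This shape is an assumption of the sketch.
    """
    seen_key = False
    for index, (key, _value) in enumerate(members):
        if key is not None:
            seen_key = True
        elif seen_key:
            raise ValueError(
                f"positional value at index {index} follows a keyed member"
            )


# Positional values first, keyed members afterwards: accepted.
check_member_order([(None, "John Doe"), (None, 25), ("isActive", True)])
```

A single pass with one flag is enough because the rule only constrains relative order: once any keyed member has been seen, every later member must also be keyed.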
An empty object must be enclosed by curly braces.
An object can contain empty values. The following object contains two empty values: the first after the name John Doe, and the second before the address.
Trailing empty commas are ignored.
As the object keys are valid Internet Object string values, any format of string (Open, Regular, Raw) can be used to represent them.
Internet Object is a schema-first format. When a schema is applied, values can be accessed using their respective keys. Without the schema, values without keys can be accessed using their respective index position, or through keys if they are provided.
Numbers in Internet Object
Internet Object numbers provide accurate numerical representation for tasks ranging from simple counting to complex financial calculations. The format supports three numeric data types (Number, BigInt, and Decimal), each designed to meet different numerical requirements in modern applications.
[Diagram: The Number Values]
Number (64-bit floating-point): Ideal for fractional or general floating-point values.
BigInt: Suited for very large integers that exceed the 64-bit limit.
Decimal: A fixed-precision type that stores exact numeric values with a set number of decimal places, making it critical for high-precision applications like financial calculations.
Internet Object supports a variety of number formats, allowing:
Integers to be expressed in decimal (base 10), binary, octal, or hexadecimal.
Floating-point numbers to be written using scientific notation.
IEEE 754 special values such as NaN and Infinity to be represented.
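To make the supported notations concrete, the Python sketch below parses a number token. It is an illustration only: the `0b`, `0o`, and `0x` prefixes for binary, octal, and hexadecimal are an assumption, since the text above does not spell out the prefix syntax, and the BigInt and Decimal types are not modeled separately. The function name is hypothetical.

```python
import math
import re

_SPECIAL = {"NaN": math.nan, "Inf": math.inf, "-Inf": -math.inf}

# Plain decimal or scientific notation, e.g. 12, 3.14, .5, 1.5e3
_FLOAT = re.compile(r"[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?")


def parse_number(text: str):
    """Parse a number token into an int or float.

    Assumption: binary, octal and hexadecimal integers use the
    0b / 0o / 0x prefixes, which int(text, 0) understands.
    """
    text = text.strip()
    if text in _SPECIAL:
        return _SPECIAL[text]
    try:
        return int(text, 0)       # decimal, 0b..., 0o..., 0x...
    except ValueError:
        pass
    if _FLOAT.fullmatch(text):
        return float(text)        # fractional or scientific notation
    raise ValueError(f"not a number: {text!r}")
```

The explicit pattern check before calling `float` keeps Python-specific spellings such as `nan` or `infinity` from being accepted, so only the case-sensitive `NaN`, `Inf`, and `-Inf` literals map to the special values.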
The simplest and most basic format is the Open string format. As the name suggests, open strings are not enclosed within any sort of enclosures or quotation marks. An open string starts with any non-whitespace codepoint and ends when a structural character is encountered or when the end of the document is reached.
Open strings cannot start or end with a whitespace character. However, whitespace characters within the strings are preserved. The quotation characters (" U+0022 or ' U+0027) do not need to be escaped.
A simple open string. Notice it is not enclosed in any sort of enclosures.
Quotes in the open strings don't cause termination.
The following object contains three open string Unicode values.
To create a multiline string, you don't need any escaping mechanism. In the following case, a string is spread over five lines.
In order to keep things simple, the open string format does not support character escaping. If the text does not fit into open string format, other formats such as regular string can be used.
In some cases (such as representing regular expressions), neither the open string nor the regular string is an efficient way to represent the text. Raw strings enclose a series of Unicode characters in single quotes (' U+0027).
The raw string is used for representing text with complex characters. The following string represents an application path in the Windows directory. It uses both the colon and the backslash, which makes it difficult to represent in the open string and regular string formats.
Raw strings format simplifies the complex string representation and enhances the readability by preventing escapes. The following regular expression is much more readable if represented using raw string format.
As with the other string formats, raw strings also handle Unicode characters.
Raw strings preserve structural characters, newline characters, and whitespace while parsing.
The raw string does not support character escaping with a reverse solidus. Every character inside the raw string is treated literally, including the reverse solidus (\ U+005C).
Only the single quote must be escaped inside a raw string, because it would otherwise terminate the string. It is escaped by doubling it, that is, by writing two consecutive single quotes.
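The quote-doubling rule can be illustrated with a minimal Python sketch. The helper names `encode_raw_string` and `decode_raw_string` are hypothetical and not part of any Internet Object library; the sketch only assumes the rules stated above (single-quote delimiters, no backslash escapes, embedded single quotes doubled).

```python
def encode_raw_string(text: str) -> str:
    """Wrap text in single quotes, doubling any embedded single quote."""
    return "'" + text.replace("'", "''") + "'"


def decode_raw_string(literal: str) -> str:
    """Reverse encode_raw_string; backslashes are kept verbatim."""
    if len(literal) < 2 or literal[0] != "'" or literal[-1] != "'":
        raise ValueError("raw string must be enclosed in single quotes")
    body = literal[1:-1]
    # Any single quote left after removing doubled pairs is unescaped.
    if "'" in body.replace("''", ""):
        raise ValueError("unescaped single quote inside raw string")
    return body.replace("''", "'")


path = "C:\\apps\\it's here"
literal = encode_raw_string(path)
print(literal)
print(decode_raw_string(literal) == path)
```

Because the backslash has no special meaning, the Windows-style path passes through unchanged; only the apostrophe is doubled.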
Open strings are a simple and elegant way to represent textual data, but they do not fit every case. For example, they cannot start with whitespace, contain structural characters such as the colon (:) or comma (,), or support complex escaping. Regular strings solve these problems by letting the user enclose the string in quotation marks (" U+0022). This makes a regular string look similar to the strings found in most programming languages.
Some of the examples of regular strings are...
A string with leading and trailing white spaces
A single quote within the regular strings doesn’t cause termination.
Regular string represented using the Unicode characters.
The new-line characters such as line-feed and carriage-return in the string are preserved while parsing.
The regular string format supports character escaping. Control characters and double quotes inside the string must be escaped.
Each character listed in the table below may be escaped by placing a reverse solidus (\ U+005C) before it. The double quotation mark (" U+0022) is the exception: it must always be escaped.
| Character | Code Point | Description |
| --- | --- | --- |
| " | U+0022 | Quotation mark |
| b | U+0062 | Backspace |
| f | U+0066 | Form feed |
| r | U+0072 | Carriage return |
| n | U+006E | Line feed |
| t | U+0074 | Horizontal tab |
| u | U+0075 | Four-digit Unicode point. Where U is case-insensitive |
| x | U+0078 | Two-digit Unicode point. Where X is case-insensitive |
A reverse solidus placed before a character that is not in the above table does not create an escape sequence. In the following example, the character a is not escaped because it is not a special code point character. Similarly, the character u is not escaped because it is not followed by a four-digit hexadecimal number.
In the above example, \n and \t are escaped because they are among the special code point characters.
A double quotation mark within the string must be escaped, as shown in the above example, because it would otherwise terminate the string.
A backslash before any non-escapable character will not have any effect! In the following example, both values represent "John Doe".
If the character is in the range U+0000 to U+00FF (Basic Latin and Latin-1 Supplement), it may be escaped as a reverse solidus followed by the lowercase letter "x" and a two-digit hexadecimal number.
If the character is in the range U+0000 to U+FFFF (the Basic Multilingual Plane), it may be escaped as a reverse solidus followed by the lowercase letter "u" and four hexadecimal digits. For example, "\u00AF".
Software that parses Internet Object text needs to compare string values, such as object member names, for equality. To achieve interoperability, implementations must compare the decoded strings code unit by code unit, numerically, so that all implementations agree on every case of equality and inequality between two strings.
To compare two strings, an implementation must first evaluate them, converting escape sequences into their respective Unicode code points, and then compare the results.
"cafe\u0301" is the same as "café"
"\u000A" is the same as "\n"
"\x0A" is the same as "\n"
"\ud83d\ude00" is the same as "😀"
"\uD83D\uDE00" is the same as "😀"
"\ud83d\udcaf" is the same as "💯"
"\uD83D\uDCAF" is the same as "💯"
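The decode-then-compare rule can be sketched in Python as follows. The function `unescape` is a hypothetical helper, not an Internet Object API. It assumes the escape table of this section, assumes that a backslash before a non-escapable character is simply dropped, and merges UTF-16 surrogate pairs into single code points before comparison.

```python
import re

# Single-letter escapes from the table in this section.
_SIMPLE = {'"': '"', "b": "\b", "f": "\f", "r": "\r", "n": "\n", "t": "\t"}

# \xHH, \uHHHH, or a backslash followed by any single character.
_ESCAPE = re.compile(
    r"\\(?:[xX]([0-9A-Fa-f]{2})|[uU]([0-9A-Fa-f]{4})|(.))", re.DOTALL
)


def unescape(body: str) -> str:
    """Decode the escapes of a regular string body (without its quotes)."""

    def replace(match: re.Match) -> str:
        hex2, hex4, char = match.groups()
        if hex2 is not None:
            return chr(int(hex2, 16))
        if hex4 is not None:
            return chr(int(hex4, 16))
        # Assumption: a backslash before a non-escapable character is dropped.
        return _SIMPLE.get(char, char)

    text = _ESCAPE.sub(replace, body)
    # Merge UTF-16 surrogate pairs into single code points.
    return text.encode("utf-16", "surrogatepass").decode("utf-16")


print(unescape(r"\u000A") == unescape(r"\x0A") == unescape(r"\n"))
print(unescape(r"\ud83d\ude00") == unescape(r"\uD83D\uDE00"))
```

Note that this sketch performs no Unicode normalization: "cafe\u0301" compares equal to "café" only when the latter is stored in its decomposed form (e followed by U+0301).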
A decimal (base 10) number is the most common number format in use today. It contains an integer component, which may be prefixed with a minus sign (- U+002D) or plus sign (+ U+002B) and may be followed by a fraction part and/or an exponent part.
Some valid examples of decimal numbers are.
Unbounded integer values for handling extremely large numbers
The BigInt type in Internet Object represents arbitrary-precision integers that can handle numeric values exceeding the limitations of standard 64-bit number representations. This makes BigInt essential for applications that need to process extremely large whole numbers with perfect precision, such as cryptographic operations, large-scale counting, or mathematical computations requiring unbounded integer arithmetic.
Unlike the regular Number type, which can represent integers safely only within approximately ±9 quadrillion, that is ±(2^53 - 1), BigInt can represent integers of arbitrary length, ensuring that large numerical operations remain exact regardless of magnitude.
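The safe-integer limit can be observed directly in Python, whose `float` is a 64-bit IEEE 754 value and whose `int` is arbitrary precision. The helper `survives_float64` is a hypothetical name used only for this illustration.

```python
MAX_SAFE_INTEGER = 2**53 - 1  # 9007199254740991


def survives_float64(n: int) -> bool:
    """Return True if n is unchanged by a round trip through a 64-bit float."""
    return int(float(n)) == n


print(survives_float64(MAX_SAFE_INTEGER))      # within the safe range
print(survives_float64(MAX_SAFE_INTEGER + 2))  # 2**53 + 1 cannot be stored exactly
```

Beyond the limit, neighbouring integers collapse onto the same floating-point value, which is the precision loss that BigInt avoids.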
A BigInt value in Internet Object is represented as an integer with the n
suffix:
BigInt values can be expressed in different numeric bases:
BigInt values are not subject to the precision limitations that affect standard floating-point numbers:
BigInt values represent whole numbers only and do not support fractional components:
BigInt is a distinct data type and cannot be implicitly mixed with regular numbers:
When used with Internet Object schemas, BigInt types can be defined and validated:
The BigInt type supports these validation properties:
min/max: Validates value range
choices: Limits valid values to a predefined set
optional: Specifies if the field is optional
null: Determines if null values are allowed
BigInt supports these arithmetic operations:
BigInt supports standard bitwise operations:
BigInt values can be compared as expected:
BigInt types are particularly valuable in:
Cryptography - key generation, hash calculations
Large-scale counting - web analytics, statistics
Financial ledgers - tracking very large monetary amounts
Mathematical computations - number theory, combinatorial calculations
IDs and timestamps - high-precision time tracking, unique identifiers
Use for Whole Numbers Only: BigInt is designed for integer operations and does not support fractional values.
Consider Performance Implications: BigInt operations may be slower than standard number operations, especially for very large values.
Explicit Type Conversions: When interfacing with systems that don't support BigInt, explicitly convert values to ensure compatibility.
Range Constraints: Even though BigInt can represent arbitrarily large values, consider setting practical min/max limits in schemas to prevent excessive resource usage.
Avoid Mixing with Regular Numbers: Maintain type consistency by not mixing BigInt with standard numbers in operations.
When implementing or working with BigInt values, keep these points in mind:
Memory Usage: BigInt values can consume significantly more memory than standard numbers, especially for very large values.
Serialization: When serializing to formats that don't natively support BigInt (like standard JSON), values must be represented as strings or custom formats.
Integer Division: Division with BigInt always produces integer results (truncated toward zero), which may require special handling for fractional calculations.
No Decimal Point: BigInt does not support decimal points or fractional values; for such needs, use the Number or Decimal types.
Implementation Model: Many BigInt implementations use a variable-length sequence of bits to represent integers of arbitrary size, providing theoretical support for numbers limited only by available memory.
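The integer-division point above deserves care when a host language rounds differently. As an illustration, Python's `//` operator floors toward negative infinity, so a sketch that follows the truncate-toward-zero rule needs an explicit adjustment. The function name `bigint_div` is hypothetical.

```python
def bigint_div(a: int, b: int) -> int:
    """Integer division truncated toward zero (Python's // floors instead)."""
    quotient = abs(a) // abs(b)
    return quotient if (a >= 0) == (b >= 0) else -quotient


print(bigint_div(7, 2), bigint_div(-7, 2))
print(-7 // 2)  # floor division gives a different result for negative operands
```

Division by zero still raises `ZeroDivisionError`, which an implementation would presumably map to its own error type.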
Characters outside the range U+0000 to U+FFFF are represented as a 12-character sequence that encodes a UTF-16 surrogate pair. For example, a string containing only the "🎛" character (U+1F39B) can be represented as "\uD83C\uDF9B".
The "😀" (U+1F600) is represented using the UTF-16 surrogate pair "\uD83D\uDE00".
The "💯" (U+1F4AF) is represented using the UTF-16 surrogate pair "\uD83D\uDCAF".
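The surrogate pairs quoted above follow from the standard UTF-16 arithmetic, sketched here in Python. The function name `to_unicode_escape` is hypothetical; the calculation itself (subtract 0x10000, split into two 10-bit halves) is the standard UTF-16 encoding rule.

```python
def to_unicode_escape(ch: str) -> str:
    """Return the \\uXXXX escape (or surrogate pair) for one character."""
    code_point = ord(ch)
    if code_point <= 0xFFFF:
        return f"\\u{code_point:04X}"
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits
    return f"\\u{high:04X}\\u{low:04X}"


for ch in ("\U0001F39B", "\U0001F600", "\U0001F4AF"):
    print(to_unicode_escape(ch))
```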
| Chars | Char Code | Detail |
| --- | --- | --- |
| - | U+002D | Minus Sign |
| + | U+002B | Plus Sign |
| . | U+002E | Decimal Point |
| 0 | U+0030 | Zero |
| E | U+0045 | Latin uppercase letter E |
| e | U+0065 | Latin lowercase letter e |
Fixed-precision decimal values for financial and high-precision computations
The Decimal type in Internet Object provides a fixed-precision decimal representation designed for applications that require exact numeric values, especially when dealing with financial calculations or other scenarios where floating-point precision issues could lead to significant errors.
Unlike standard floating-point numbers (which may suffer from approximation issues), Decimal types store exact numeric values with a defined precision and scale, ensuring accurate and predictable arithmetic operations.
A Decimal value in Internet Object is represented as a number with the m
suffix:
Decimal values also support scientific notation:
Each Decimal value is defined by two key properties:
Precision (M): The total number of significant digits in the number
Scale (D): The number of digits after the decimal point
For example, in 123.45m
:
The precision is 5 (total digits: 1,2,3,4,5)
The scale is 2 (decimal digits: 4,5)
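As an illustration, precision and scale can be derived from a literal with Python's standard `decimal` module. The helper `precision_and_scale` is a hypothetical name, and counting trailing zeros implied by a positive exponent as integer digits is an assumption of this sketch.

```python
from decimal import Decimal


def precision_and_scale(literal: str) -> tuple[int, int]:
    """Return (precision, scale) for a literal such as '123.45m'."""
    _sign, digits, exponent = Decimal(literal.rstrip("m")).as_tuple()
    if exponent >= 0:
        # No fractional digits; a positive exponent adds trailing zeros.
        return len(digits) + exponent, 0
    scale = -exponent
    return max(len(digits), scale), scale


print(precision_and_scale("123.45m"))
```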
Decimal values maintain their precision throughout operations, making them ideal for financial calculations where exact values are required:
When precision or scale constraints require rounding:
If a value has more decimal places than the specified scale, it's rounded using the "half up" rounding method (≥ 5 rounds up)
If the rounded value would exceed the precision, a validation error is raised
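The two rules above can be sketched with Python's `decimal` module, which here stands in for an Internet Object implementation. The function `fit_decimal` and its error message are hypothetical; only the "half up" rounding and the precision check come from the text.

```python
from decimal import Decimal, ROUND_HALF_UP


def fit_decimal(value: str, precision: int, scale: int) -> Decimal:
    """Round value to `scale` places (half up) and enforce `precision`."""
    quantum = Decimal(1).scaleb(-scale)  # scale 2 -> Decimal("0.01")
    rounded = Decimal(value).quantize(quantum, rounding=ROUND_HALF_UP)
    if len(rounded.as_tuple().digits) > precision:
        raise ValueError(f"{rounded} needs more than {precision} digits")
    return rounded


print(fit_decimal("123.455", 5, 2))  # rounded half up to two places
print(fit_decimal("123.45", 7, 4))   # scale increased with trailing zeros
```

A value such as 999.995 with precision 5 and scale 2 rounds to 1000.00, which needs six digits, so the sketch raises an error rather than silently widening the value.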
When used with Internet Object schemas, decimal types can be precisely defined and validated:
The decimal type supports various validation properties:
precision: Restricts the total number of significant digits
scale: Controls the number of decimal places
min/max: Validates value range
choices: Limits valid values to a predefined set
optional: Specifies if the field is optional
null: Determines if null values are allowed
Internet Object provides mechanisms for converting between different numeric types:
Decimal values with different precision/scale configurations can be converted to ensure compatibility during operations:
The convert method allows:
Increasing scale: Adds zeros to the right (e.g., 123.45 → 123.4500)
Decreasing scale: Rounds to the new scale (e.g., 123.456 → 123.46)
Adjusting precision: Ensures the new value fits within specified constraints
When working with decimals of different formats:
Two decimals must have the same precision and scale for direct comparison.
Decimals with different configurations must be converted to a common format.
Conversion may fail if the target precision cannot accommodate the value.
It's important to note that the integer part of a decimal must fit within the precision minus the scale, and the fractional part within the scale. For example:
Example of decimal comparison with conversion:
Decimal values can be compared with other decimal values as expected:
Decimal types are particularly valuable in:
Financial applications - currency calculations, banking, accounting
Scientific computing - when exact decimal representation matters
Regulatory compliance - when calculations must be exactly reproducible
Monetary APIs - for consistent data exchange with financial systems
Explicitly Define Precision/Scale: Always specify the precision and scale for decimal types in schemas to ensure data consistency.
Use for Financial Data: Prefer decimal type over floating-point for monetary values to avoid rounding errors.
Consider Storage Implications: Decimal types require more storage than standard numbers due to their exact representation.
Range Validation: Use min/max constraints to ensure decimal values stay within expected business bounds.
Manage Precision/Scale When Combining Decimals: When performing operations across decimals with different precision/scale, explicitly convert them to a common format or ensure your target format can accommodate the result.
When implementing or working with decimal values, keep the following points in mind:
Precision: Decimal types preserve exact values without rounding errors, unlike floating-point numbers. This makes them ideal for financial and high-precision applications.
Interoperability: Decimal types are compatible with database systems and financial applications that require exact decimal arithmetic, ensuring seamless data exchange and integration.
Performance: While operations on decimal values may be slower than native floating-point operations, they provide guaranteed precision, which is crucial for applications where accuracy is paramount.
Consistency: Decimal calculations produce the same results regardless of the platform or implementation, ensuring reliable and predictable outcomes across different environments.
Implementation Model: The underlying implementation of the Decimal type uses a coefficient-exponent model, similar to database systems like SQL Server and Oracle. This provides a strong basis for interoperability with enterprise data systems and ensures consistent behavior.
Specs coming up soon!
Internet Object supports binary non-fractional integer values. The binary integer may be prefixed with an optional plus or minus sign and must start with the literal characters "0b" or "0B", followed by one or more 0 or 1 digits.
| Chars | Char Code | Detail |
| --- | --- | --- |
| - | U+002D | Minus Sign |
| + | U+002B | Plus Sign |
| 0 | U+0030 | Zero |
| B | U+0042 | Latin uppercase letter B |
| b | U+0062 | Latin lowercase letter b |
| 0 or 1 | U+0030 to U+0031 | Binary digits: ASCII digits zero to one |
Some examples of Binary Integer values...
Internet Object supports octal non-fractional integer values. The octal integer may be prefixed with an optional plus or minus sign and must start with the literal characters 0c or 0C, followed by one or more octal digits.
Some of the octal integer values are...
| Chars | Char Code | Detail |
| --- | --- | --- |
| - | U+002D | Minus Sign |
| + | U+002B | Plus Sign |
| 0 | U+0030 | Zero |
| C | U+0043 | Latin uppercase letter C |
| c | U+0063 | Latin lowercase letter c |
| 0-7 | U+0030 to U+0037 | Octal digits: ASCII digits zero to seven |
| Chars | Char Code | Detail |
| --- | --- | --- |
| - | U+002D | Minus Sign |
| + | U+002B | Plus Sign |
| Inf | U+0049 U+006E U+0066 | The keyword Inf |
| NaN | U+004E U+0061 U+004E | The keyword NaN |
Internet Object supports hexadecimal non-fractional integer values. The hexadecimal integer may be prefixed with an optional plus or minus sign and must start with the literal characters "0X" or "0x", followed by one or more hexadecimal digits. The hexadecimal digits A through F can be lower or upper case.
| Chars | Char Code | Detail |
| --- | --- | --- |
| - | U+002D | Minus Sign |
| + | U+002B | Plus Sign |
| 0 | U+0030 | Zero |
| X | U+0058 | Latin uppercase letter X |
| x | U+0078 | Latin lowercase letter x |
| 0-9 | U+0030 to U+0039 | Hexadecimal digits: ASCII digits zero to nine |
| A-F | U+0041 to U+0046 | Hexadecimal digits: Latin uppercase letters A to F |
| a-f | U+0061 to U+0066 | Hexadecimal digits: Latin lowercase letters a to f |
Some examples of Hexadecimal Integer values...
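A parser might recognise the prefixed integer forms (binary 0b, octal 0c, hexadecimal 0x) roughly as in the Python sketch below. The function `parse_prefixed_int` is a hypothetical helper written from the prose rules in these sections; it is not taken from an Internet Object implementation.

```python
import re

_BASES = {"b": 2, "c": 8, "x": 16}
# Optional sign, zero, base marker, then one or more candidate digits.
_PREFIXED = re.compile(r"([+-]?)0([bBcCxX])([0-9A-Fa-f]+)")


def parse_prefixed_int(text: str) -> int:
    """Parse literals such as 0b1010, -0c17, or +0xFF."""
    match = _PREFIXED.fullmatch(text)
    if match is None:
        raise ValueError(f"not a prefixed integer: {text!r}")
    sign, marker, digits = match.groups()
    # int() raises ValueError if a digit is not valid for the base.
    value = int(digits, _BASES[marker.lower()])
    return -value if sign == "-" else value


print(parse_prefixed_int("0b1010"), parse_prefixed_int("-0c17"), parse_prefixed_int("+0xFF"))
```

Using a single pattern for all three bases keeps the sketch short; the per-base digit check is delegated to `int`, which rejects, for example, the digit 2 in a binary literal.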
Internet Object supports single-line comments. A comment starts with a hash sign (# U+0023) and ends when the line terminates. You can place comments anywhere inside the document, and any text written after the hash sign on the same line is ignored by the parser.
Comments in Internet Object serve multiple purposes:
Document Structure Elucidation: Providing context for different sections of the document.
Schema and Field Description: Offering explanations for data structures and individual fields.
Metadata Provision: Including information about the document itself, such as version or purpose.
Code Segmentation: Improving readability by logically separating different parts of the document.
Best Practices:
Be Clear and Concise: Use comments to clarify complex sections, but avoid stating the obvious.
Keep Comments Updated: Ensure that comments reflect any changes in the code to prevent misinformation.
Avoid Overusing: Excessive comments can clutter the document. Use them judiciously to highlight important information.
Place Comments Appropriately: Position comments near the relevant code or data structure to maintain context.
Effective use of comments enhances document comprehensibility and maintainability by providing contextual information and explanations for data structures and values.
Booleans in Internet Object
Boolean values are represented using the keywords T and F or true and false. A true value can be written as T or true, and a false value as F or false.
| Chars | Char Code | Detail |
| --- | --- | --- |
| T | U+0054 | The uppercase letter T |
| F | U+0046 | The uppercase letter F |
| true | U+0074 U+0072 U+0075 U+0065 | The keyword true |
| false | U+0066 U+0061 U+006C U+0073 U+0065 | The keyword false |
In the following example, isActive is of boolean type and accepts only boolean values, i.e. T, F, true, or false.
Nulls in Internet Object
The null values are represented using N
and null
keywords.
In the following example, the null value can be passed for age.
| Chars | Char Code | Detail |
| --- | --- | --- |
| N | U+004E | The uppercase letter N |
| null | U+006E U+0075 U+006C U+006C | The keyword null |
The Internet Object format uses UTF-8 as its default encoding, and every implementation must support it. This ensures that all implementations can reliably read and write text consistently. While other encodings such as UTF-16, UTF-32, or ASCII can be used, keep in mind that not all systems may support them.
Adding a Byte Order Mark (U+FEFF) at the start of your Internet Object text won't cause issues; the parser simply treats it as a space. However, it's generally a good idea to omit the BOM unless you have a specific reason to include it.
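One way a reader could apply the BOM rule is sketched below in Python. The function `decode_io_bytes` is hypothetical, and replacing the leading U+FEFF with a literal space is just one possible reading of "treat it as a space"; a tokenizer could equally classify U+FEFF as whitespace.

```python
def decode_io_bytes(data: bytes) -> str:
    """Decode UTF-8 input and neutralise a leading byte order mark."""
    text = data.decode("utf-8")
    # Assumption: a leading BOM is replaced by an ordinary space.
    if text.startswith("\ufeff"):
        text = " " + text[1:]
    return text


print(repr(decode_io_bytes(b"\xef\xbb\xbfJohn Doe, 25")))
```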
A Collection is a record aggregator that allows sending multiple records over the internet without repeatedly defining key-value pairs. The Collection embeds more than one independent record in the IO document.
A Collection reduces the complexity of defining key-value pairs every time a record is sent over the internet. It thereby simplifies application development by offering data parallelism and operational simplicity.
The collection permits an Internet Object document to have multiple records with different types and structures, independent of each other.
A Collection must be represented with the tilde sign (~ U+007E) followed by the object, separated by whitespace, as shown in the Collection structure.
The tilde sign enables the parser to identify the next record. Here is a code snippet that shows how to represent a collection.
A collection may be created with or without explicitly defining schema definition for the records. However, it is always recommended to define a schema for the collections of records.
A Simple Collection can be created in the data section of the Internet Object document by prefixing each record with a tilde sign (~ U+007E). It enables the parser to identify the next record when multiple records are sent.
In a Simple Collection, because the schema is not defined, the type and structure of the collection records can differ.
An Explicit Collection is created by explicitly defining the schema for the collection of records. Prefixing the schema with the tilde sign (~ U+007E) enables the parser to understand that multiple records may be sent according to that schema definition.
Here in the code snippet, multiple records are passed in the data section of the document using Collections.
When object data is passed frequently between systems over the internet, there is a need to stream objects over a single connection.
Because a collection embeds more than one independent record in the document, it lends itself to streaming real-time data changes, so that the application can react immediately to changing events.
Multiple records can be sent in batches after validating them against the schema, as follows:
After you have received the first batch of records, the collection allows you to receive more records for the same collection separately. The Internet Object processor should take care of merging the stream of data into the same collection.
If the schema is not defined, the records in the collection can have a different structure from each other across the document. Here is the code snippet,
In the above example, the schema is not defined for the collection records so it will be parsed as,
Sending an empty record is valid only if all the variables defined in the schema are either set to null
or optional
or both.
In the above code snippet, A and B are set to null and C is optional, so sending an empty record is valid, because sending just a "~" means an empty object { }.
In the above example, A is null, B is null and optional, and C is optional. All the keys are therefore either optional or null or both, so sending an empty record is valid, because sending just a "~" means an empty object { }.
In the above example, the invalid record fails while parsing because the name variable is neither optional nor null. On the other hand, the age variable is optional as well as null, so it is valid not to pass any value for it.
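The empty-record rule reduces to a simple check over the schema members, sketched here in Python. Representing a schema as a dictionary of member flags, and the name `empty_record_is_valid`, are assumptions made for this illustration only.

```python
def empty_record_is_valid(schema: dict[str, dict]) -> bool:
    """An empty record is valid only if every member is optional or nullable."""
    return all(
        member.get("optional", False) or member.get("null", False)
        for member in schema.values()
    )


all_relaxed = {"A": {"null": True}, "B": {"null": True, "optional": True}, "C": {"optional": True}}
name_required = {"name": {}, "age": {"optional": True, "null": True}}

print(empty_record_is_valid(all_relaxed))
print(empty_record_is_valid(name_required))
```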
The Collection enables the parser to parse the rest of the document even if the previous record fails to execute.
If the record fails while parsing, that record state becomes invalid and it does not stop parsing the rest of the document.
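The continue-on-failure behaviour can be sketched as a loop that isolates each record. The function `parse_collection`, the `parse_record` callback, and the shape of the result dictionaries are all hypothetical; the sketch only illustrates that one invalid record is marked as such without stopping the rest.

```python
from typing import Any, Callable


def parse_collection(
    records: list[str], parse_record: Callable[[str], Any]
) -> list[dict]:
    """Parse every record; a failing record is marked invalid, not fatal."""
    results = []
    for index, raw in enumerate(records):
        try:
            results.append({"index": index, "valid": True, "value": parse_record(raw)})
        except ValueError as error:
            results.append({"index": index, "valid": False, "error": str(error)})
    return results


# Stand-in record parser for the demo: each record must be an integer.
outcome = parse_collection(["1", "oops", "3"], int)
print([item["valid"] for item in outcome])
```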
The Internet Object document promotes reusability through variables. It allows defining variables that can be applied to simplify schema and definitions, obfuscate values, or reduce the data size. Every key defined in the definition section can be used as a variable.
Internet Object variables can be categorized into two groups.
Value Variables
Schema Variables
The Value variables are used to directly access and reuse values.
In the above snippet records
, y
, n
, and rgb
are the value variables.
Schema variables start with the $ sign and are used to directly access and reuse schemas.
In the above code snippet, the schema variable address
is reused in another schema variable named person
.
The value variables and schema variables enable the reuse of definition.
Variables are also used for hiding critical information with modified content to enforce data protection and security.
The following example demonstrates how one can pass critical information over the internet using variables.
The above code snippet represents secrectKey
as s
and activationKey
as a
saved on the client-side. This information is securely passed over the internet using variables as shown below,
The receiver will receive the following information without compromising on data security.
The use of variables helps to reduce the code size as it enables definition reuse that ultimately reduces bandwidth utilization.
In the above code snippet, the schema variable address
and accountDetails
are used in the person
schema definition. So, rather than creating a similar schema multiple times for address
it can be created once and reused multiple times in the document.
Variables improve schema readability by grouping similar and reusable codes and limiting line length.
Apart from the schema, the IO document header can have definitions. The definitions allow you to define schema, variables, metadata, and much more. In essence, the definitions are the collection of key-value pairs, with the following structure.
The definition must start with a tilde symbol (~ U+007E) followed by a key-value pair. The key and the value must be separated by a colon (: U+003A).
Element
Unicode
Details
~
U+007E
Tilde - Starts the definition
:
U+003A
Colon - Key and Value Separator
Key
N/A
Value
N/A
WS
WhiteSpace Char
0 or more white-space character
Simple definitions such as meta-data declaration can be written as shown in the code snippet below.
Any value defined in the definition section can be used as a variable. The dollar ($) prefix can be used to declare a schema and/or consume a variable value. If the key starts with a $ sign, the variable is treated as a schema and handled accordingly.
Here in the code snippet, y: yes and n: no are used as value definitions. Similarly, keys prefixed with the $ sign represent schema definitions.
The header section of the Internet Object document can have single or multiple schema definitions.
In the above example, the schema definitions are created for reuse to improve the readability of a schema. The schema definition created for address is reused in the person
schema definition.
The string key, as defined in the .
The values as defined in the .
Internet Object is a schema first format! The Internet Object schema defines the shape and structure of Internet Object documents and helps the developers and designers to create the object definitions in a simple, concise, clear, and human-readable way.
The schema asserts the shape of the IO objects and ensures the validity of the data during the serialization and deserialization process.
The schema may be attached and placed in the document header or kept separately. Internet Object schema provides a simple way to define the object structure!
In its simplest form, an object schema is just an object with a list of required members. The following example represents the schema with five keys i.e name
, age
, address
, isActive
, remark
which are separated by ","
.
A schema can be embedded in the IO document header or defined independently. The following example shows the schema embedded into the document itself. The upper section declares the schema while the lower section contains the object.
The schema in the previous example lacks data types. Since the keys are not associated with any data type, the default data type is any. That means the value for the name field could be John Doe, T, 999, or anything else.
We can attach types and sub-schema to the keys to add more constraints and clarity.
Here name
, age
, address
, isActive
, and remark
are defined with the type string
, int
, object
, bool
and string
respectively.
A string type can be defined with the members such as type
, default
, choices
, pattern
, minLen
, maxLen
, len
, optional
, and null
. Schema of the string TypeDef should be written as,
The TypeDef schema ensures the validity of string
MemberDefs.
The first member of the typedef is type. A string can be defined with the type string or one of its derived types, i.e. email, url, datetime, date, or time. The next snippet shows how the string type and its derived types can be defined.
The second member in the string
typedef is default
. Here is how the default values can be defined for a string.
The default value is applicable only if no other value is provided for the key.
If for a key, null is set to true then it must be replaced by its default value.
The default value when set must match with the data type of a key.
The next member in the string
typedef is choices
. If set, the choices must be an array of strings. Here the snippet shows how the choices
can be added to member variables in a string so that it is restricted to the fixed set of available choices.
The value of the pattern must be a string, and that string must be a valid regular expression. The data is then validated against the regular expression. A regular expression can be defined by using pattern in the schema of a string.
Regular expression dialects differ between programming environments. To remain compatible with the host environment, it is better to stick to the constructs described below.
A single Unicode character, other than the special characters specified below, matches itself.
(. U+002E): Matches any character except the newline character (U+000A).
(^ U+005E): Matches only at the start of the string.
($ U+0024): Matches only at the end of the string.
(...): Groups a sequence of regular expressions into a single regular expression.
(| U+007C): Matches either the regular expression preceding or the one following the "|" symbol.
[abc]: Matches any of the characters enclosed by the square brackets.
[a-z]: Matches any character in the range enclosed by the square brackets.
[^abc]: Matches any character not in the list.
[^a-z]: Matches any character outside the given range.
(+ U+002B): Repeats the preceding regular expression one or more times. It is greedy and matches as many repetitions as possible.
(* U+002A): Repeats the preceding regular expression zero or more times. It is greedy and matches as many repetitions as possible.
(? U+003F): Makes the preceding regular expression optional. It is greedy and matches zero or one occurrence of the preceding regular expression.
+?, *?, ??: The +, *, and ? qualifiers are greedy and match as much text as possible, which is not always desired. Adding ? after a qualifier makes it lazy, so that it matches as little text as possible.
(?!x), (?=x): Negative and positive lookahead.
{x}: Match exactly x occurrences of the preceding regular expression.
{x,y}: Match at least x and at most y occurrences of the preceding regular expression.
{x,}: Match x or more occurrences of the preceding regular expression.
{x}?, {x,y}?, {x,}?: Lazy versions of the above expressions.
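The difference between greedy and lazy qualifiers is easiest to see on a concrete input. The sketch below uses Python's standard `re` module as the host environment, which is an assumption; the sample text and patterns are illustrative only.

```python
import re

text = "<b>bold</b> and <i>italic</i>"

# Greedy: .+ extends to the last ">" in the text.
greedy = re.findall(r"<.+>", text)
# Lazy: .+? stops at the first ">" it can.
lazy = re.findall(r"<.+?>", text)

print(greedy)  # ['<b>bold</b> and <i>italic</i>']
print(lazy)    # ['<b>', '</b>', '<i>', '</i>']
```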
The value of maxLen must be a non-negative integer. The string instance is valid only if the number of characters in the string is less than or equal to the value of maxLen. Here is the snippet showing how to assign maxLen.
The value of minLen must be a non-negative integer. The string instance is valid only if the number of characters in the string is greater than or equal to the value of minLen. Here is the snippet showing how to assign minLen.
The value of the length, represented as len, must be a non-negative integer. The string instance is valid only if the number of characters in the string is equal to the value of len. The code snippet shows how to assign len.
The member of a string type can be set to optional. Here is the code snippet that demonstrates how a string can be set to optional.
A string when set to null: true
will accept null values. The snippet below shows how to set a nullable string.
Here are some of the examples that demonstrate how to define string member definition.
In the above snippet, name
can be kept optional
and null
. When no value is passed for the name then, its default value is set to anonymous
. The name
should be a string containing characters from a to z (upper or lower case) with a minimum length of 5 and a maximum length of 50.
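The constraints described for name can be sketched as a small Python validator. Everything here is hypothetical: the function `resolve_name`, the exact pattern `[a-zA-Z]+`, and the decision to keep an explicit null as null are assumptions of the sketch, not the behaviour of an Internet Object implementation.

```python
import re

_MISSING = object()                        # marks "no value was passed"
_NAME_PATTERN = re.compile(r"[a-zA-Z]+")   # assumed pattern: letters only


def resolve_name(value=_MISSING):
    """Apply default, null, pattern, minLen 5, and maxLen 50 to a name."""
    if value is _MISSING:
        return "anonymous"   # optional member: fall back to the default
    if value is None:
        return None          # null is allowed (assumption: kept as null)
    if not isinstance(value, str):
        raise ValueError("name must be a string")
    if not 5 <= len(value) <= 50:
        raise ValueError("name length must be between 5 and 50")
    if _NAME_PATTERN.fullmatch(value) is None:
        raise ValueError("name may contain only the letters a to z")
    return value


print(resolve_name(), resolve_name(None), resolve_name("Alice"))
```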
Here the code snippet shows that the users can only select the department provided in choices
i.e input is restricted to the set of available departments.
In the above code snippet, users can select the location provided in choices
i.e the input is restricted to the set of available locations ( locations are enclosed in double-quotes to pass numeric data as string ).
The any type is used to assign a value of any type to a variable. It is useful where the actual type is not known or where types need to be assigned dynamically. For a member with no declared type, the default type is therefore always any.
Any type can be defined with the members such as type
, default
, choices
, anyOf
, optional
, and null
. Schema of any
TypeDef should be written as,
The TypeDef schema ensures the validity of any
MemberDefs.
As with most of the types in Internet Object, the first member of typedef is type
. The next snippet shows different ways to define the members a
, b
, and c
as any
.
The second member in the any
typedef is default
. Here is how the default values can be defined.
Here, the default value for a
is Monday
and default value for b
is null
.
The choices
restricts the member to a fixed set of unique constant values. If set, choices must be an array, and its items may be of any type. The code snippet shows how choices
can be defined for the any
type.
In some cases, a member must accept different kinds of values. For example, a number could be a multiple of 3 or a multiple of 5; a value could be a string or a number but no other type; or a member could accept two different schema formats. The anyOf
allows schema designers to define members that can accept different kinds of constrained values. It accepts an array of MemberDef and/or schema and types.
This snippet explains how a
, b
and c
can accept various kinds of values.
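The idea behind anyOf can be sketched in Python as a combinator that accepts a value when at least one alternative accepts it. This is a conceptual illustration only; the helper names (`any_of`, `multiple_of`, `is_number`, `is_string`) are hypothetical and are not part of the Internet Object specification.

```python
from typing import Any, Callable, Iterable

Validator = Callable[[Any], bool]


def is_number(value: Any) -> bool:
    """True for int or float values; bool is excluded on purpose."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def is_string(value: Any) -> bool:
    """True for string values only."""
    return isinstance(value, str)


def multiple_of(n: int) -> Validator:
    """Build a validator that accepts numbers evenly divisible by n."""
    return lambda value: is_number(value) and value % n == 0


def any_of(validators: Iterable[Validator]) -> Validator:
    """Build a validator that passes when at least one alternative passes."""
    alternatives = list(validators)
    return lambda value: any(check(value) for check in alternatives)


# The two cases mentioned in the text above.
multiple_of_3_or_5 = any_of([multiple_of(3), multiple_of(5)])
string_or_number = any_of([is_string, is_number])
```

With these definitions, `multiple_of_3_or_5` accepts 9 and 10 but rejects 7, and `string_or_number` rejects values such as null or arrays.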
When null
is set to true, a member can accept null values. Here are some of the ways through which a member of any type can accept null values.
When optional
is set to true, a member is marked as optional. Here are some of the ways through which a member of any type can be made optional.
Some of the valid examples of members with any
are...
In its simplest form, an object schema can be just a set of keys separated by a comma ","
. Such a schema ensures that the object has values for all the required keys. A schema can be defined in the header section of the Internet Object document or maintained separately. The following example shows how easy it is to create a basic schema-first Internet Object document.
Defining members with simple types is sufficient in some cases; however, simple types do not provide a mechanism to create complex constraints. In such cases, MemberDefs (member definitions) are used to create complex type definitions. MemberDefs are objects designed to impose complex constraints on the associated member. For example, the following schema allows the member phrases
to accept string values only if the length is between 20 and 200 characters. A value that does not meet the constraint is invalid.
A MemberDef is generally written as a data type or schema, followed by optional comma-separated values and condition: value pairs, all enclosed in curly braces as shown below.
The following example represents a schema with MemberDef.
In the above example, the schema has a complex type definition which is created using MemberDef. The variable name
has type
, min
and max
as its members. The variable age
has type
, and max
as its members. Similarly, gender
has type
and choices
as its members.
The above representations are valid and will successfully validate the following object.
An object schema can be defined with the child objects and arrays schema. The child object defined in the schema definition must be enclosed inside the curly braces.
In the above example, the key variable address
has a child object having four keys i.e street
, city
, state
and zip
. Similarly, the key skills
is defined with an array schema having a default value null
and array
of type strings
.
The above representations are valid and will successfully validate the following object.
An Internet Object document may contain multiple schemas. These schemas can be reused and/or nested to reduce code length; in such a case, a schema must be defined before it is used.
The schema must be prefixed with a dollar sign ($
U+0024
) when it is reused inside other schema definitions, so that the parser can identify the schema variable and map values to the respective keys according to the schema definition.
In the above example, the address
schema is represented using a dollar sign ($
U+0024
), so that it can be reused inside another schema personalDetails
.
In the above example, the header section has three schemas i.e address
, person
and accountDetails
. The address
schema is used inside the person
schema and person
schema is used inside the accountDetails
schema making the schema representation nested.
The above representations are valid and will successfully validate the following object.
To map values from the data section to the variables in the header, the parser will first check the header section for the default schema definition that starts with the word "schema"
. Once the parser finds out the default schema, all values will be checked by matching them with the variables. If they are matched then they will be mapped to the schema variables.
If the values are not matched with the variables the parser will check for the other schema definition.
In the above example, the parser will first check the header part for the schema definition with the word "schema"
. Once it is found it will start validating all the variables in the schema i.e name,
age
, address
, and isActive
against the values in the data section; values that match are mapped successfully.
If the header section of the document does not contain a schema definition and contains only value definitions and/or metadata, as shown in the example below, the values will be mapped to the positioned variables after getting the values from the value variable.
If the header section in the document is empty, then the parser will map all values in the data section to the respectively positioned variables.
If the header section contains only a schema and no definitions, the parser will map the values to the schema variables as shown in the example below.
A schema may contain the optional and nullable key. The key can be set to optional
by suffixing it with a question mark (?
U+003F
). Similarly, the key can be made nullable by suffixing it with an asterisk (*
U+002A
).
In the above example, the defined schema contains both nullable and optional keys. The key gender
is set to optional
as well as null
and key age
is set to optional
.
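As a rough illustration of the suffix rules described above, the following Python sketch splits a raw key into its name and its optional and nullable flags. The function `parse_key` and the `KeySpec` type are hypothetical, and accepting the two markers in either order is an assumption rather than something the text states.

```python
from typing import NamedTuple


class KeySpec(NamedTuple):
    name: str
    optional: bool  # key carried the "?" suffix
    nullable: bool  # key carried the "*" suffix


def parse_key(raw: str) -> KeySpec:
    """Split a schema key such as 'age?' into its name and flags."""
    name = raw.strip()
    optional = False
    nullable = False
    # Strip trailing markers; accepting them in either order is an assumption.
    while name and name[-1] in "?*":
        if name[-1] == "?":
            optional = True
        else:
            nullable = True
        name = name[:-1]
    if not name:
        raise ValueError("key name is empty")
    return KeySpec(name, optional, nullable)
```

Under these assumptions, `parse_key("age?")` yields an optional key named `age`, and a key carrying both markers is reported as optional and nullable.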
Internet object specifies email
, url
, datetime
, date
and time
as derived types of string and also provides built-in support for them.
The following snippet represents a string and its derived types.
Here, name
is of string type and will only accept strings. Similarly, emailId
, profileUrl
, journyDate
, departureTime
and bookingDatetimedate
are of different types such as email
, url
, date
, time
, datetime
. Therefore they will only accept values with the defined types for the respective variable.
The Internet Object schema does not require the schema designer to provide the type of a member. If not provided, such members are marked as any
type. That means, they can be assigned any . For more information see the datatype.
It is always recommended to define to which datatype members belong. Internet Object members can be assigned any allowed datatype such as , , , , , , , , , , , , , or .
Schema Definition represents the collection of key-value pairs declared using schema in the header section of the document. For more details refer to .
The internet object schema defines six data types that include , , , , , , , , , , , , , or .
The code snippet shows how to define an email-id
in the Internet Object Document.
The choices
can be added to member variables in the email so that it is restricted to the fixed set of available choices. Choices must be an array of valid emails. Here, the snippet shows how to add choices for the string subtype email.
Users may specify a pattern
for the email by defining pattern
as shown in the snippet below.
Derived from String, the Internet Object Date is an ISO 8601 compatible date format. It can be represented as YYYY-MM-DD
or YYYYMMDD
i.e. it can be passed with or without the separator (- U+002D).
| Value | Description | Optional | Default | Example |
| --- | --- | --- | --- | --- |
| YYYY | Four-digit decimal number (0000-9999) | No | - | 2020 |
| MM | Two-digit decimal number (01-12) | Yes | 01 | 04 |
| DD | Two-digit decimal number (01-31) | Yes | 01 | 30 |
It prescribes a minimum four-digit year format from a range 0000
to 9999
to avoid the year 2000
problem. However, years from a range 1583
to 9999
are automatically allowed by a standard, while years prior to 1583
can only be used by mutual agreement of the partners in information interchange.
| Date with separators | Date without separators |
| --- | --- |
| YYYY-MM-DD = 2020-02-17 | YYYYMMDD = 20200217 |
| YYYY-MM = 2020-02 | YYYYMM = 202002 |
| YYYY = 2020 | YYYY = 2020 |
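The accepted date layouts listed above can be sketched as a small Python parser. This is only an illustration of the formats and defaults described in this section; the function name `parse_io_date` is hypothetical, and it checks only the field ranges given in the table (it does not check, for example, that February has at most 29 days).

```python
import re
from typing import Tuple

# Year is required; month and day are optional. The separator must be used
# consistently: either "-" between every part or no separator at all.
_DATE_RE = re.compile(
    r"^(?P<y>\d{4})"
    r"(?:(?P<sep>-?)(?P<m>\d{2})"
    r"(?:(?P=sep)(?P<d>\d{2}))?)?$"
)


def parse_io_date(text: str) -> Tuple[int, int, int]:
    """Return (year, month, day); a missing month or day defaults to 01."""
    match = _DATE_RE.match(text)
    if match is None:
        raise ValueError(f"not a valid date: {text!r}")
    year = int(match["y"])
    month = int(match["m"] or "01")
    day = int(match["d"] or "01")
    if not 1 <= month <= 12 or not 1 <= day <= 31:
        raise ValueError(f"month or day out of range: {text!r}")
    return year, month, day
```

Both `2020-02-17` and `20200217` parse to the same triple, while a mixed form such as `2020-0217` is rejected.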
The code snippet demonstrates how to define and use the date type.
Other than regular string, raw string, and open string, a string can be passed as an Internet email as it is predefined in the parser. The email format is derived from the recommended by W3C. Internet Object Email Format does not follow the RFC-5322 email representation as it is too strict to implement for the users.
Email format follows the syntax specified in the
The Email is derived from the String type, hence it shares the same as the String. However, Email enforces additional constraints with the respective email format.
The Date is derived from the String type, hence it shares the same as the String. However, Date enforces additional constraints with respective date format and the same is applicable to the Date MemberDef.
A number type can be defined with the members such as type
, default
, choices
, min
, max
, multipleOf
, divisibleBy
, optional
and null
. Schema of the number TypeDef should be written as,
The TypeDef schema ensures the validity of number
MemberDefs.
The first member of the typedef is type
. The number can be of type number
or its derived types i.e int
, int16
, int32
, byte
. Here the next snippet shows how the number
type and its derived types can be defined.
The second member in the number
typedef is default
. Here is how the default values can be defined for a number.
Rules for default:
The default value is applicable only if no other value is provided for the key.
If for a key, null is set to true then it must be replaced by its default value.
The default value when set must match with the data type of a key.
The choices
can be added to member variables in numbers so that the input values are restricted to the fixed set of available choices. Choices must be an array of numbers. The code snippet shows how to add choices.
The max
represents the maximum value of a key and must be a number. A numeric instance is valid against max
only if its value is less than or equal to the value of max
. Here is the snippet that shows how to set max
value for a number.
The min
represents the minimum value of a key and must be a number. A numeric instance is valid against min
only if its value is greater than or equal to the value of min
. Here is the snippet that shows how to set min
value for a number.
The multipleOf
is used to restrict the value to multiples of a given number. The value of multipleOf
must be a positive integer. The code snippet shows how to restrict the input value to the multiple of the desired number.
The divisibleBy
is used to restrict the value to numbers divisible by a given number as shown below.
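The min, max and multipleOf rules described above can be sketched in Python as follows. This is an illustrative check under the rules as stated in this section, not a reference implementation; the function name `check_number` and its parameter names are hypothetical.

```python
from typing import Any, Optional, Union

Number = Union[int, float]


def check_number(value: Any,
                 minimum: Optional[Number] = None,
                 maximum: Optional[Number] = None,
                 multiple_of: Optional[int] = None) -> bool:
    """Return True if `value` is a number that satisfies the constraints."""
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return False
    if minimum is not None and value < minimum:
        return False
    if maximum is not None and value > maximum:
        return False
    if multiple_of is not None:
        # The text requires multipleOf to be a positive integer.
        if not isinstance(multiple_of, int) or multiple_of <= 0:
            raise ValueError("multipleOf must be a positive integer")
        if value % multiple_of != 0:
            return False
    return True
```

For instance, 15 satisfies `minimum=0, maximum=100, multiple_of=5`, whereas 7 fails the `multiple_of=5` check. Both bounds are inclusive, matching the "less than or equal to" and "greater than or equal to" wording.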
The member of a number type can be set to optional
. Here is the code snippet that demonstrates how a number can be set to optional.
A number when set to null: true
will accept null values. The snippet below shows how to set a nullable number.
Here are some of the examples that demonstrate how to define number member definition.
The code snippet shows how to define a url
in the Internet Object document.
The choices
can be added to member variables in the url
so that it is restricted to the fixed set of available choices. Choices must be an array of valid url
. The code snippet here shows how to add choices for the url.
Users may specify a pattern
for the url
by defining pattern
as,
Time can be represented as HH:mm:ss.SSS
or HHmmss.SSS
, i.e. it can be passed with or without separators (: U+003A)
.
It uses a 24-hour clock system. Midnight is a special case and it may be referred to as "00:00"
or "24:00"
. However, ISO 8601-1: 2019 no longer permits "24:00"
.
| Value | Description | Example |
| --- | --- | --- |
| HH | 24-hour clock hour (00-23) | 01 |
| mm | Minutes, decimal number (00-59) | 46 |
| ss | Seconds, decimal number (00-59) | 55 |
| SSS | Milliseconds, three-digit decimal number (000-999) | 500 |
| Time with separators | Time without separators |
| --- | --- |
| HH:mm:ss.SSS = 05:24:34.555 | HHmmss.SSS = 052434.555 |
| HH:mm:ss = 05:24:34 | HHmmss = 052434 |
| HH:mm = 05:24 | HHmm = 0524 |
| HH = 05 | HH = 05 |
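The time layouts above can be illustrated with a short Python parser. It is a sketch under the ranges given in the table (hours 00-23, so "24:00" is rejected here), and the function name `parse_io_time` is hypothetical.

```python
import re
from typing import Tuple

# Hour is required; minutes, seconds and milliseconds are optional.
# The ":" separator must be used consistently or not at all.
_TIME_RE = re.compile(
    r"^(?P<h>\d{2})"
    r"(?:(?P<sep>:?)(?P<mi>\d{2})"
    r"(?:(?P=sep)(?P<s>\d{2})(?:\.(?P<ms>\d{3}))?)?)?$"
)


def parse_io_time(text: str) -> Tuple[int, int, int, int]:
    """Return (hour, minute, second, millisecond); missing parts default to 0."""
    match = _TIME_RE.match(text)
    if match is None:
        raise ValueError(f"not a valid time: {text!r}")
    hour = int(match["h"])
    minute = int(match["mi"] or "0")
    second = int(match["s"] or "0")
    millisecond = int(match["ms"] or "0")
    if hour > 23 or minute > 59 or second > 59:
        raise ValueError(f"time component out of range: {text!r}")
    return hour, minute, second, millisecond
```

Both `05:24:34.555` and `052434.555` produce the same result, and a bare `05` is read as 05:00:00.000.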
The code snippet demonstrates how to define and use time
in the Internet Object document.
Internet Object DateTime is inspired by the ISO 8601 format. DateTime can be passed as a string and is represented in the following ways, with or without separators.
Date Time format with separator: YYYY-MM-DDThh:mm:ss.SSSZ
Example : 1997-07-16T19:20:30.500+01:00
Date Time format without separator: YYYYMMDDThhmmss.SSSZ
Example: 19970716T192030.500+0100
The DateTime is a string type; therefore, there is no need to enclose it in double quotation marks (" U+0022)
as parser identifies it as a string.
Here, T
is a delimiter used to separate the date
from the time
. The time
portion in the DateTime
object must be preceded by T
. Z
represents the time zone designator (+hh:mm
or -hh:mm
).
| Symbol | Represents | Range | Default Value |
| --- | --- | --- | --- |
| YYYY | Year | 1990, 1991... | - |
| MM | Month | 01, 02 ... 11, 12 | 01 |
| DD | Day | 01, 02 ... 30, 31 | 01 |
| HH | Hour | 00, 01 ... 22, 23 | 00 |
| mm | Minute | 00, 01 ... 58, 59 | 00 |
| ss | Second | 00, 01 ... 58, 59 | 00 |
| SSS | Millisecond | 000, 001 ... 998, 999 | 000 |
| Symbol | Character | Unicode | Description |
| --- | --- | --- | --- |
| - | Hyphen | U+002D | Used to separate parts of the date |
| : | Colon | U+003A | Used to separate parts of the time |
| . | Period | U+002E | Used to separate seconds from milliseconds |
| T | Alphabet "T" | U+0054 | DateTime separator |
The Time Zone can be represented as ±hh:mm
or ±hhmm
i.e with or without separators. For example, +00:00
, +0000
or +00
. However, representing -00:00
, -0000
, or -00
is not permitted. While representing a Time Zone, a plus sign must be used for zero and positive values and a minus sign for negative values.
Time Zone is written at the end of a DateTime. It is not a separate data type on its own. It can only be passed when both date and time are passed to a DateTime object. "Z"
can be directly added after time without space, where "Z"
is the zone designator for the zero UTC offset. When the time zone is omitted, it defaults to Z, also known as Zulu Time or Greenwich Mean Time (GMT), i.e. +00:00
.
| Time Zone with separators | Time Zone without separators |
| --- | --- |
| +HH:mm = +05:30 | +HHmm = +0530 |
| +HH = +05 | +HH = +05 |
| -HH:mm = -12:30 | -HHmm = -1230 |
| -HH = -12 | -HH = -12 |
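The time zone rules above, including the ban on a negative zero offset, can be sketched in Python. This is an illustration only; the function name `parse_tz_offset` and the choice to return the offset in minutes are assumptions made for the example.

```python
import re

# Sign and hours are required; minutes are optional, with or without ":".
_TZ_RE = re.compile(r"^(?P<sign>[+-])(?P<h>\d{2})(?::?(?P<m>\d{2}))?$")


def parse_tz_offset(text: str) -> int:
    """Return the UTC offset in minutes for a designator such as '+05:30'."""
    if text == "Z":
        return 0  # "Z" is the zone designator for the zero UTC offset
    match = _TZ_RE.match(text)
    if match is None:
        raise ValueError(f"not a valid time zone: {text!r}")
    hours = int(match["h"])
    minutes = int(match["m"] or "0")
    if hours > 23 or minutes > 59:
        raise ValueError(f"time zone component out of range: {text!r}")
    total = hours * 60 + minutes
    if match["sign"] == "-":
        if total == 0:
            # -00:00, -0000 and -00 are not permitted.
            raise ValueError("a zero offset must be written with a plus sign")
        return -total
    return total
```

Here `+05:30` and `+0530` both map to 330 minutes, `-12:30` maps to -750, and `-00:00` raises an error.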
Valid date-time representation with separator separating date, time, and time-zone
Valid date-time representation without separator separating date, time, and time-zone.
Defining date time in the Internet Object Document.
Defining date time with timezone in the Internet object Document.
When a variable is classified as an int
then it will accept all the integers including signed and unsigned integers. It will not accept float values.
The Integer must be represented using one or more ASCII digits (0 U+0030
to 9 U+0039
), prefixed with or without the minus sign (- U+002D)
. The integer must not contain a decimal point as some language validators will accept it and some will not.
Internet object specifies the following number derived types and also provides built-in support for them.
The following snippet represents a number and its derived types.
Here the applicationNo
is of integer type and will only accept integers. Similarly, rollNo
, totalScore
, percentage
and paperCode
are of different types such as int32
, int16
, number
, and byte
. Therefore they will only accept values with the defined types for the respective variable.
When a variable is classified as a byte
then the data will be accepted only if it is an integer with the size of a byte or 8 bits. A byte may have decimal, hexadecimal, octal or binary values. The range of values is from -128
to 127
.
By default the max
value of byte
type variable is 127
and min
is -128
.
When a variable is classified as an int32
type in the schema then it will be classified as an integer with a size of 32 bits or 4 bytes. The range of values is from -2,147,483,648
to 2,147,483,647
.
By default the max
value of int32
type variable is 2,147,483,647
and min
is -2,147,483,648
.
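The default bounds of the sized integer types can be derived from their bit widths. The Python sketch below assumes each type is a two's-complement signed integer of the stated width, which matches the int32 range given above; the helper names `int_range` and `fits` are hypothetical.

```python
from typing import Any, Tuple

# Bit widths of the sized integer types named in this document.
_BITS = {"byte": 8, "int16": 16, "int32": 32}


def int_range(type_name: str) -> Tuple[int, int]:
    """Return (min, max) for a signed two's-complement integer type."""
    bits = _BITS[type_name]
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def fits(value: Any, type_name: str) -> bool:
    """True if `value` is an integer within the range of `type_name`."""
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        return False
    low, high = int_range(type_name)
    return low <= value <= high
```

For example, `int_range("int32")` gives the pair -2,147,483,648 and 2,147,483,647, and `fits(1.5, "int16")` is false because the value is not an integer.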
Similar to Email, a URL can also be passed as a string. The URL format is derived from the recommended by W3C.
URL format follows the syntax specified in the
The URL is derived from the String type, hence it shares the same as the String. However, URL enforces additional constraints with the respective URL format.
The Time is derived from the String type, hence it shares the same as the String. However, Time enforces additional constraints with the respective time format and the same is applicable to the Time MemberDef.
The DateTime is derived from the String type, hence it shares the same as the String. However, DateTime enforces additional constraints with the respective Datetime format and the same is applicable to the DateTime MemberDef.
The int
is derived from the number type and shares the same members as the Number, i.e. type
, default
, choices
, max
, min
, multipleOf
, divisibleBy
, optional
and null
while enforcing the additional constraint that the number must be of integer type.
The byte
is derived from the number type and shares the same members as the Number, i.e. type
, default
, choices
, max
, min
, multipleOf
, divisibleBy
, optional
and null
while enforcing the additional constraint that the number must be of byte type.
The int32
is derived from the number type and shares the same members as the Number, i.e. type
, default
, choices
, max
, min
, multipleOf
, divisibleBy
, optional
and null
while enforcing the additional constraint that the number must be of int32
type.
An Internet Object array can be defined with the members such as type
, default
, len
, minLen
, maxLen
, optional
and null
. Schema of the array TypeDef should be written as,
The TypeDef schema ensures the validity of array
MemberDefs.
The first member of the Internet Object array is a schema. When the schema is defined, all array items must be validated against it. The code snippet demonstrates how the array can be defined with the schema.
The next member in the array
typedef is default
. Here is how the default values can be defined for an array.
The value of minLen
must be a non-negative integer. The array instance is valid only if the number of items in the array is greater than or equal to the value of minLen
. The code snippet shows how to define minLen
for an array.
The value of maxLen
must be a non-negative integer. The array instance is valid only if the number of items in the array is less than or equal to the value of maxLen
. Here the code snippet shows how to define maxLen
for an array.
The next member in the array
typedef is length represented as len
, which must be a non-negative integer. The array instance is valid only if the number of items in the array is exactly equal to the value of len
. Here is how the len
can be defined for an array.
When null
is set to true, an array will accept null values. The code snippet demonstrates how an array can accept a null value.
A member of an array type can be set to optional. The code snippet demonstrates different ways in which an array can be set to optional.
An array type can be specified as shown in the snippet below.
Some of the valid examples of members with array type are...
The above example can be simplified as,
An array can have mixed values as shown in the snippet below.
An array containing another array represents a nested array as shown in the code snippet.
A multidimensional array is an array with more than one dimension. Two and three-dimensional arrays are called multidimensional arrays. Here is the code snippet that demonstrates how a multidimensional array is represented.
An object is the fundamental unit of Internet Object document, it can be defined with the members such as schema
, type
, default
, optional
and null
.
The TypeDef schema ensures the validity of object
MemberDefs.
In the internet object document, the object may or may not be defined with the member called schema
. But it is always recommended to define the schema for an object.
If the schema is not defined then the user can pass an object with values of any
type i.e anyOf: [string, object]
.
The above code snippet represents how the object can be defined with the typedef member schema
.
The second member of the typedef is type
. By default, the object can be of string or an object type. Here the next snippet shows how the object type can be defined.
The next member in the object
typedef is default
. Here is how the default values can be defined for an object.
When null
is set to true, an object will accept null values. The code snippet demonstrates how an object can accept a null value.
A member of an object type can be set to optional. Here are some of the ways through which a member of an object type can be made optional.
An empty object is useful for accepting any object value irrespective of its structure. The empty object definitions can be created using empty curly braces syntax or ignoring schema. Here are some ways in which empty object definitions can be created.
A simple object is an ordered collection of key-value pairs that avoids nesting of the object and may or may not contain a child object as shown in the code snippet.
An object can be defined with the MemberDef as shown in the snippet below.
An Object can be nested inside another object. Accessing a nested object is similar to accessing a nested array. Here is the code snippet that shows how objects can be nested.
Defining dynamic schema allows users to add a dynamic object as shown in the snippet below.
A boolean data type is used to assign boolean values to the variable i.e True and False. A boolean can be defined with the members such as type
, default
, optional
and null
. Schema of the bool TypeDef should be written as,
The TypeDef schema ensures the validity of bool
MemberDefs.
The first member of the bool
typedef is type
. The next snippet shows how to define a boolean type. Only two values can be passed, i.e. true or false. They can be represented as T
, true
, F
, false
.
The next member of the bool
typedef is default
. The code snippet shows how to define a default for the bool
type. The default values are used during the processing of data/instructions if a value is not provided for a key.
A member can be marked as optional if optional
is set to true. The value of optional
must be of boolean type, i.e. true
or false
. Here are some ways in which a member of bool type can be marked as optional.
When null
is set to true, a member can accept null values. The following snippet shows how to set a member of the bool type to null.
Here are some valid examples of members with bool
type...
The int16
is derived from the number type and shares the same members as the Number, i.e. type
, default
, choices
, max
, min
, multipleOf
, divisibleBy
, optional
and null
while enforcing the additional constraint that the number must be of int16
type.
The object can be represented as shown here.
Internet Object allows passing extra values in the dynamic record using an asterisk sign (* U+002A)
without explicitly defining them in the schema definition.
It is invalid to pass extra values in the record without using "*"
sign in the schema definition.
In the above example, extra values are passed without using a "*"
sign which is invalid.
In order to send the extra values in the dynamic record, the "*"
sign must be placed after the last field in the schema as shown in the example below.
In the above example placing the "*"
sign after the last key field enables the parser to map extra values to the positioned number as,
The above example shows that the extra values passed in the second record are mapped to positions 5
and 6
respectively.
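The positional mapping of extra values described above can be sketched in Python. This is a simplified illustration, not the behavior of an actual parser: the function `map_record` and its `allow_extra` flag (standing in for a trailing "*" in the schema) are hypothetical, and numbering the extra positions from zero is an assumption.

```python
from typing import Any, Dict, List


def map_record(keys: List[str], values: List[Any],
               allow_extra: bool = False) -> Dict[str, Any]:
    """Map positional values to schema keys; extras are keyed by position."""
    if len(values) > len(keys) and not allow_extra:
        # Without "*" in the schema, extra values are invalid.
        raise ValueError("extra values require '*' in the schema")
    record: Dict[str, Any] = dict(zip(keys, values))
    # Extra values have no key, so they are stored under their position.
    # Zero-based numbering is an assumption made for this sketch.
    for index in range(len(keys), len(values)):
        record[str(index)] = values[index]
    return record
```

With `allow_extra=True`, a record that carries two values beyond the schema keys keeps them under their positions; without it, the same record is rejected.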
Extra values can be passed using key-value pairs in the data section and using "*"
in the header.
In the above example, extra values are passed using key-value pairs. The values will be mapped to the respective keys defined in the schema as well as in the record as shown below,
The TypeDef can be defined for the extra values in the schema. So that the values can be passed according to TypeDef.
The values must be passed according to the TypeDef defined in the schema otherwise it will throw an error.
The values will be mapped to the respective keys defined in the schema as shown below,
In the above example, the extra values must be of string type; thus, passing values of other types is invalid.
The schema constraints can be defined for the extra values in the schema. So that the values can be passed accordingly.
The values must be passed according to the defined constraints in the schema otherwise it will throw an error.
The values will be mapped to the respective keys defined in the schema as shown below,
In the above example the request id: "12"
is a string, but its length is less than the minimum length defined in the schema constraint; hence it is invalid.
While defining a schema, the object is not required to be wrapped in curly braces unless it is a child object or the schema of an array. As the schema is a valid object, the same applies to the schema.
The schema may be wrapped in curly braces. While defining the schema, key members must be enclosed in the curly braces if the number of members is more than one.
In the above example, the address will only contain street
; city
and state
are independent keys. To add street
, city
and state
in the address
, it must be included in the curly braces as shown below,
Curly braces are optional if the associated schema accepts less than two values.
In the above example, line no 9
is invalid as the number of values passed for the address schema is more than one. The z-street, California
must be included in the curly braces as {z-street, California}.
The multiple optional and nullable members in the same schema should be defined with care as it may lead to invalid mapping.
In the above example, the street
is optional
. Value Mumbai
is passed for the city but it will be assigned to street
. This is because a value, when passed, is first assigned to the optional
key. Therefore it is essential to define schema carefully while using optional keys.
Many times, a member in the schema needs to accept values of multiple types. One option would be to use the any
type so that members can accept any value. However, the best option is to use anyOf
, the constraint provided by the any
type. In the following example, the member test
can accept any string or number value.
An object can be represented as a MemberDef or a Schema. Object as a MemberDef can be easily differentiated from Object as a schema using some rules expressed in the flowchart below.
If the object has a type member then it is parsed as MemberDef. In the following example, testScore
is a MemberDef, as it defines the type of object.
If the object contains schema then it is a MemberDef. In the following example, testData
is a MemberDef, because it contains schema.
If the schema inside the object is set to an array then it is a MemberDef. In the following example, subjectMarks
is a MemberDef, because it contains the schema of an array.
If the object does not fall under any of the above conditions then it is not a MemberDef. It is a schema of an object. The following example represents the object schema.
If the first value in the object is a string that names a type, such as number, string, object, bool, etc., then the object is a MemberDef. In the following example, the name
is a MemberDef, because it defines the string value.
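The rules above for telling a MemberDef from an object schema can be sketched in Python. This is an illustration under stated assumptions: a parsed object is represented here as a list of (key, value) pairs with the key set to None for positional values, and both this representation and the names `is_member_def` and `TYPE_NAMES` are hypothetical.

```python
from typing import Any, List, Optional, Tuple

# Type names mentioned in this document.
TYPE_NAMES = {
    "any", "string", "number", "int", "int16", "int32", "byte", "bool",
    "object", "array", "email", "url", "date", "time", "datetime",
}

Entry = Tuple[Optional[str], Any]  # (key, value); key is None when positional


def is_member_def(entries: List[Entry]) -> bool:
    """Return True if the object is a MemberDef, False if it is a schema."""
    keys = {key for key, _ in entries if key is not None}
    # Rules: a "type" member or a "schema" member marks a MemberDef.
    if "type" in keys or "schema" in keys:
        return True
    # Rule: a leading positional string that names a type marks a MemberDef.
    if entries:
        first_key, first_value = entries[0]
        if first_key is None and isinstance(first_value, str) \
                and first_value in TYPE_NAMES:
            return True
    # Otherwise the object is treated as the schema of an object.
    return False
```

Under this sketch, an object whose first value is the string `string` is classified as a MemberDef, while an object that only lists keys such as `street` and `city` is classified as a schema.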
ISC (Internet Systems Consortium) License
Copyright 2020 Mohamed Aamir Maniar.
Permission to use, copy, modify, and/or distribute this software for any purpose with or without fee is hereby granted, provided that the above copyright notice and this permission notice appear in all copies.
THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
| Contributor | Contribution |
| --- | --- |
| Ujwala Mhashakhetri | Document Drafting |
| Kabir Maniar | Document Drafting and Diagrams |
Frequently Asked Questions about Internet Object (In no particular order). These are the questions that have been frequently asked after the concept was previewed to the community.
One of the primary objectives of the Internet Object is to be solely a text-based human-readable serialization format. Hence, the current version of Internet Object does not natively support direct binary data. Binary data may be encoded using algorithms like Base64 so that it can be passed as a string value.
JSON support was not one of the objectives of creating the Internet Object. However, the final format turned out to be JSON compatible. So yes, Internet Object understands the JSON format.
Yes, Internet Object schema can validate an Internet Object document as well as a JSON object.
The uncompressed (non-gzipped) IO document is generally 40% smaller. When compared with gzipped versions of JSON and IO documents, we saw unpredictable results. Sometimes the IO document was smaller than JSON; sometimes it was around the same size. On a few occasions, the JSON document was a bit smaller than the IO version.
Internet Object is a very simple format. It is very easy to build the document just by concatenating the strings! In such cases, it is very fast to build the document. However, in reality, the performance of the parsing depends upon the parser and other factors. A well-written parser will be faster than poorly written parsers.
Great, that you would like to support Internet Object. You can contribute in many ways.
Some of them are...
Join the team that is developing an Internet Object library in your favorite language.
Write a blog or article about the Internet Object
Help friends and colleagues get started with the concept
Help us develop various technical documentations
Be a proofreader and help us correct the specification and document language
Translate the documentation in various languages
Spread the word about Internet Object
As the Greek philosopher Heraclitus said, “change is the only constant.” Internet Object was created to address some of the issues found in JSON, which happens to be the most prominent data serialization format today. For more information, .