
Tutorial

This tutorial introduces the basics of the Document Object Model (DOM) API.

As shown in Usage at a glance, JSON can be parsed into a DOM, which can then be queried and modified easily, and finally converted back to JSON.

Value & Document {#ValueDocument}

Each JSON value is stored in a type called Value. A Document, representing the DOM, contains the root Value of the DOM tree. All public types and functions of RapidJSON are defined in the rapidjson namespace.

Query Value {#QueryValue}

In this section, we will use excerpts from example/tutorial/tutorial.cpp.

Assume we have the following JSON stored in a C string (const char* json):

{
    "hello": "world",
    "t": true ,
    "f": false,
    "n": null,
    "i": 123,
    "pi": 3.1416,
    "a": [1, 2, 3, 4]
}

Parse it into a Document:

#include "rapidjson/document.h"

using namespace rapidjson;

// ...
Document document;
document.Parse(json);

The JSON is now parsed into document as a DOM tree:

[Figure: DOM in the tutorial]

Since the update to RFC 7159, the root of a conforming JSON document can be any JSON value. In the earlier RFC 4627, only objects or arrays were allowed as root values. In this case, the root is an object.

assert(document.IsObject());

Let's query whether a "hello" member exists in the root object. Since a Value can contain different types of value, we may need to verify its type and use a suitable API to obtain the value. In this example, the "hello" member is associated with a JSON string.

assert(document.HasMember("hello"));
assert(document["hello"].IsString());
printf("hello = %s\n", document["hello"].GetString());

hello = world

JSON true/false values are represented as bool.

assert(document["t"].IsBool());
printf("t = %s\n", document["t"].GetBool() ? "true" : "false");

t = true

JSON null can be queried with IsNull().

printf("n = %s\n", document["n"].IsNull() ? "null" : "?");

n = null

The JSON Number type represents all numeric values. However, C++ needs a more specific type for manipulation.

assert(document["i"].IsNumber());

// In this case, IsUint()/IsInt64()/IsUint64() also return true.
assert(document["i"].IsInt());          
printf("i = %d\n", document["i"].GetInt());
// Alternative (int)document["i"]

assert(document["pi"].IsNumber());
assert(document["pi"].IsDouble());
printf("pi = %g\n", document["pi"].GetDouble());

i = 123
pi = 3.1416

A JSON array contains a number of elements.

// Using a reference for consecutive access is handy and faster.
const Value& a = document["a"];
assert(a.IsArray());
for (SizeType i = 0; i < a.Size(); i++) // Uses SizeType instead of size_t
    printf("a[%u] = %d\n", i, a[i].GetInt()); // %u because SizeType is unsigned by default

a[0] = 1
a[1] = 2
a[2] = 3
a[3] = 4

Note that RapidJSON does not automatically convert values between JSON types. If a value is a string, for example, it is invalid to call GetInt() on it. In debug mode this fails an assertion. In release mode, the behavior is undefined.

In the following sections we discuss details about querying individual types.

Query Array {#QueryArray}

By default, SizeType is a typedef of unsigned. On most systems, an array is therefore limited to storing up to 2^32-1 elements.

You may access the elements in an array by integer literal, for example, a[0], a[1], a[2].

Array is similar to std::vector: instead of using indices, you may also use iterators to access all the elements.

for (Value::ConstValueIterator itr = a.Begin(); itr != a.End(); ++itr)
    printf("%d ", itr->GetInt());

And other familiar query functions:

  • SizeType Capacity() const
  • bool Empty() const

Range-based For Loop (New in v1.1.0)

When C++11 is enabled, you can use a range-based for loop to access all elements in an array.

for (auto& v : a.GetArray())
    printf("%d ", v.GetInt());

Query Object {#QueryObject}

Similar to Array, we can access all object members by iterator:

static const char* kTypeNames[] = 
    { "Null", "False", "True", "Object", "Array", "String", "Number" };

for (Value::ConstMemberIterator itr = document.MemberBegin();
    itr != document.MemberEnd(); ++itr)
{
    printf("Type of member %s is %s\n",
        itr->name.GetString(), kTypeNames[itr->value.GetType()]);
}

Type of member hello is String
Type of member t is True
Type of member f is False
Type of member n is Null
Type of member i is Number
Type of member pi is Number
Type of member a is Array

Note that when operator[](const char*) cannot find the member, it will fail an assertion.

If we are unsure whether a member exists, we need to call HasMember() before calling operator[](const char*). However, this incurs two lookups. A better way is to call FindMember(), which checks for the existence of the member and obtains its value in a single lookup:

Value::ConstMemberIterator itr = document.FindMember("hello");
if (itr != document.MemberEnd())
    printf("%s\n", itr->value.GetString());

Range-based For Loop (New in v1.1.0)

When C++11 is enabled, you can use a range-based for loop to access all members in an object.

for (auto& m : document.GetObject())
    printf("Type of member %s is %s\n",
        m.name.GetString(), kTypeNames[m.value.GetType()]);

Querying Number {#QueryNumber}

JSON provides a single numerical type called Number. Number can be an integer or a real number. RFC 4627 says the range of Number is specified by the parser implementation.

As C++ provides several integer and floating point number types, the DOM tries to handle these with the widest possible range and good performance.

When a Number is parsed, it is stored in the DOM as one of the following types:

Type        Description
unsigned    32-bit unsigned integer
int         32-bit signed integer
uint64_t    64-bit unsigned integer
int64_t     64-bit signed integer
double      64-bit double precision floating point

When querying a number, you can check whether the number can be obtained as the target type:

Checking            Obtaining
bool IsNumber()     N/A
bool IsUint()       unsigned GetUint()
bool IsInt()        int GetInt()
bool IsUint64()     uint64_t GetUint64()
bool IsInt64()      int64_t GetInt64()
bool IsDouble()     double GetDouble()

Note that an integer value may be obtained in various ways without conversion. For example, a value x containing 123 will make x.IsInt() == x.IsUint() == x.IsInt64() == x.IsUint64() == true. But a value y containing -3000000000 will only make y.IsInt64() == true.

When obtaining numeric values, GetDouble() converts the internal integer representation to a double. Note that int and unsigned can be safely converted to double, but int64_t and uint64_t may lose precision (since the mantissa of a double is only 52 bits wide, not every integer with a magnitude above 2^53 is representable).

Query String {#QueryString}

In addition to GetString(), the Value class also contains GetStringLength(). Here is why.

According to RFC 4627, JSON strings can contain the Unicode character U+0000, which must be escaped as "\u0000". The problem is that C and C++ often use null-terminated strings, which treat '\0' as the terminator symbol.

To conform to RFC 4627, RapidJSON supports strings containing U+0000. If you need to handle this, you can use GetStringLength() to obtain the correct string length.

For example, after parsing the following JSON into a Document d:

{ "s" :  "a\u0000b" }

The correct length of the value "a\u0000b" is 3. But strlen() returns 1.

GetStringLength() can also improve performance, since users often need to call strlen() to allocate a buffer.