ファイルからメタデータを抽出する (構造化)

ガイド Box AI Box AIのチュートリアル ファイルからメタデータを抽出する (構造化)

ファイルからメタデータを抽出する (構造化)

Box AI APIを使用すると、指定したファイルからメタデータを抽出し、結果をキー/値ペアの形式で取得することができます。入力には、fieldsパラメータを使用して構造を作成するか、すでに定義済みのメタデータテンプレートを使用できます。テンプレートの作成の詳細については、メタデータテンプレートのカスタマイズを参照するか、メタデータテンプレートAPIを使用してください。

このエンドポイントでは、以下のファイル形式がサポートされています。

PDF
TIFF
PNG
JPEG

Box AIは、画像ファイル (TIFF、PNG、JPEG) やスキャンしたドキュメントを処理する際、自動的に光学式文字認識 (OCR) を適用します。これにより、抽出前に画像をPDFに変換する必要がなくなるため、時間の節約と統合の簡略化が実現します。

Box AIは、以下の言語のドキュメントからメタデータを抽出できます。

英語
日本語
中国語
韓国語
キリル文字ベースの言語 (ロシア語、ウクライナ語、ブルガリア語、セルビア語など)

異なる言語や画像形式を使用するために追加の構成は必要ありません。Box AIは、自動的に言語を検出し、必要に応じてOCRを適用します。

Platformアプリを作成して認証するには、Box AIの使い方に記載されている手順に従っていることを確認してください。

リクエストを送信するには、POST /2.0/ai/extract_structuredエンドポイントを使用します。

cURL

curl -i -L 'https://api.box.com/2.0/ai/extract_structured' \
     -H 'content-type: application/json' \
     -H 'authorization: Bearer <ACCESS_TOKEN>' \
     -d '{
        "items": [
          {
            "id": "12345678",
            "type": "file",
            "content": "This is file content."
          }
        ],
        "metadata_template": {
            "template_key": "",
            "type": "metadata_template",
            "scope": ""
        },
        "fields": [
            {
              "key": "name",
              "description": "The name of the person.",
              "displayName": "Name",
              "prompt": "The name is the first and last name from the email address.",
              "type": "string",
              "options": [
                {
                  "key": "First Name"
                },
                {
                  "key": "Last Name"
                }
              ]
            }
        ],
        "ai_agent": {
          "type": "ai_agent_extract_structured",
          "long_text": {
            "model": "azure__openai__gpt_4o_mini"
            },
          "basic_text": {
            "model": "azure__openai__gpt_4o_mini"
         }
      }
   }'

Node/TypeScript v10

await client.ai.createAiExtractStructured({
  fields: [
    {
      key: 'firstName',
      displayName: 'First name',
      description: 'Person first name',
      prompt: 'What is the your first name?',
      type: 'string',
    } satisfies AiExtractStructuredFieldsField,
    {
      key: 'lastName',
      displayName: 'Last name',
      description: 'Person last name',
      prompt: 'What is the your last name?',
      type: 'string',
    } satisfies AiExtractStructuredFieldsField,
    {
      key: 'dateOfBirth',
      displayName: 'Birth date',
      description: 'Person date of birth',
      prompt: 'What is the date of your birth?',
      type: 'date',
    } satisfies AiExtractStructuredFieldsField,
    {
      key: 'age',
      displayName: 'Age',
      description: 'Person age',
      prompt: 'How old are you?',
      type: 'float',
    } satisfies AiExtractStructuredFieldsField,
    {
      key: 'hobby',
      displayName: 'Hobby',
      description: 'Person hobby',
      prompt: 'What is your hobby?',
      type: 'multiSelect',
      options: [
        { key: 'guitar' } satisfies AiExtractStructuredFieldsOptionsField,
        { key: 'books' } satisfies AiExtractStructuredFieldsOptionsField,
      ],
    } satisfies AiExtractStructuredFieldsField,
  ],
  items: [new AiItemBase({ id: file.id })],
  aiAgent: aiExtractStructuredAgentBasicTextConfig,
} satisfies AiExtractStructured);

Python v10

client.ai.create_ai_extract_structured(
    [AiItemBase(id=file.id)],
    fields=[
        CreateAiExtractStructuredFields(
            key="firstName",
            display_name="First name",
            description="Person first name",
            prompt="What is the your first name?",
            type="string",
        ),
        CreateAiExtractStructuredFields(
            key="lastName",
            display_name="Last name",
            description="Person last name",
            prompt="What is the your last name?",
            type="string",
        ),
        CreateAiExtractStructuredFields(
            key="dateOfBirth",
            display_name="Birth date",
            description="Person date of birth",
            prompt="What is the date of your birth?",
            type="date",
        ),
        CreateAiExtractStructuredFields(
            key="age",
            display_name="Age",
            description="Person age",
            prompt="How old are you?",
            type="float",
        ),
        CreateAiExtractStructuredFields(
            key="hobby",
            display_name="Hobby",
            description="Person hobby",
            prompt="What is your hobby?",
            type="multiSelect",
            options=[
                CreateAiExtractStructuredFieldsOptionsField(key="guitar"),
                CreateAiExtractStructuredFieldsOptionsField(key="books"),
            ],
        ),
    ],
    ai_agent=ai_extract_structured_agent_basic_text_config,
)

.NET v10

await client.Ai.CreateAiExtractStructuredAsync(requestBody: new AiExtractStructured(items: Array.AsReadOnly(new [] {new AiItemBase(id: file.Id)})) { Fields = Array.AsReadOnly(new [] {new AiExtractStructuredFieldsField(key: "firstName") { DisplayName = "First name", Description = "Person first name", Prompt = "What is the your first name?", Type = "string" },new AiExtractStructuredFieldsField(key: "lastName") { DisplayName = "Last name", Description = "Person last name", Prompt = "What is the your last name?", Type = "string" },new AiExtractStructuredFieldsField(key: "dateOfBirth") { DisplayName = "Birth date", Description = "Person date of birth", Prompt = "What is the date of your birth?", Type = "date" },new AiExtractStructuredFieldsField(key: "age") { DisplayName = "Age", Description = "Person age", Prompt = "How old are you?", Type = "float" },new AiExtractStructuredFieldsField(key: "hobby") { DisplayName = "Hobby", Description = "Person hobby", Prompt = "What is your hobby?", Type = "multiSelect", Options = Array.AsReadOnly(new [] {new AiExtractStructuredFieldsOptionsField(key: "guitar"),new AiExtractStructuredFieldsOptionsField(key: "books")}) }}) });

Swift v10

try await client.ai.createAiExtractStructured(requestBody: AiExtractStructured(fields: [AiExtractStructuredFieldsField(key: "firstName", displayName: "First name", description: "Person first name", prompt: "What is the your first name?", type: "string"), AiExtractStructuredFieldsField(key: "lastName", displayName: "Last name", description: "Person last name", prompt: "What is the your last name?", type: "string"), AiExtractStructuredFieldsField(key: "dateOfBirth", displayName: "Birth date", description: "Person date of birth", prompt: "What is the date of your birth?", type: "date"), AiExtractStructuredFieldsField(key: "age", displayName: "Age", description: "Person age", prompt: "How old are you?", type: "float"), AiExtractStructuredFieldsField(key: "hobby", displayName: "Hobby", description: "Person hobby", prompt: "What is your hobby?", type: "multiSelect", options: [AiExtractStructuredFieldsOptionsField(key: "guitar"), AiExtractStructuredFieldsOptionsField(key: "books")])], items: [AiItemBase(id: file.id)]))

Java v10

client.getAi().createAiExtractStructured(new AiExtractStructured.Builder(Arrays.asList(new AiItemBase(file.getId()))).fields(Arrays.asList(new AiExtractStructuredFieldsField.Builder("firstName").description("Person first name").displayName("First name").prompt("What is the your first name?").type("string").build(), new AiExtractStructuredFieldsField.Builder("lastName").description("Person last name").displayName("Last name").prompt("What is the your last name?").type("string").build(), new AiExtractStructuredFieldsField.Builder("dateOfBirth").description("Person date of birth").displayName("Birth date").prompt("What is the date of your birth?").type("date").build(), new AiExtractStructuredFieldsField.Builder("age").description("Person age").displayName("Age").prompt("How old are you?").type("float").build(), new AiExtractStructuredFieldsField.Builder("hobby").description("Person hobby").displayName("Hobby").prompt("What is your hobby?").type("multiSelect").options(Arrays.asList(new AiExtractStructuredFieldsOptionsField("guitar"), new AiExtractStructuredFieldsOptionsField("books"))).build())).aiAgent(aiExtractStructuredAgentBasicTextConfig).build())

.NET v6

await client.Ai.CreateAiExtractStructuredAsync(requestBody: new AiExtractStructured(items: Array.AsReadOnly(new [] {new AiItemBase(id: file.Id)})) { Fields = Array.AsReadOnly(new [] {new AiExtractStructuredFieldsField(key: "firstName") { DisplayName = "First name", Description = "Person first name", Prompt = "What is the your first name?", Type = "string" },new AiExtractStructuredFieldsField(key: "lastName") { DisplayName = "Last name", Description = "Person last name", Prompt = "What is the your last name?", Type = "string" },new AiExtractStructuredFieldsField(key: "dateOfBirth") { DisplayName = "Birth date", Description = "Person date of birth", Prompt = "What is the date of your birth?", Type = "date" },new AiExtractStructuredFieldsField(key: "age") { DisplayName = "Age", Description = "Person age", Prompt = "How old are you?", Type = "float" },new AiExtractStructuredFieldsField(key: "hobby") { DisplayName = "Hobby", Description = "Person hobby", Prompt = "What is your hobby?", Type = "multiSelect", Options = Array.AsReadOnly(new [] {new AiExtractStructuredFieldsOptionsField(key: "guitar"),new AiExtractStructuredFieldsOptionsField(key: "books")}) }}), IncludeConfidenceScore = true });

Node v4

await client.ai.createAiExtractStructured({
  fields: [
    {
      key: 'firstName',
      displayName: 'First name',
      description: 'Person first name',
      prompt: 'What is the your first name?',
      type: 'string',
    } satisfies AiExtractStructuredFieldsField,
    {
      key: 'lastName',
      displayName: 'Last name',
      description: 'Person last name',
      prompt: 'What is the your last name?',
      type: 'string',
    } satisfies AiExtractStructuredFieldsField,
    {
      key: 'dateOfBirth',
      displayName: 'Birth date',
      description: 'Person date of birth',
      prompt: 'What is the date of your birth?',
      type: 'date',
    } satisfies AiExtractStructuredFieldsField,
    {
      key: 'age',
      displayName: 'Age',
      description: 'Person age',
      prompt: 'How old are you?',
      type: 'float',
    } satisfies AiExtractStructuredFieldsField,
    {
      key: 'hobby',
      displayName: 'Hobby',
      description: 'Person hobby',
      prompt: 'What is your hobby?',
      type: 'multiSelect',
      options: [
        { key: 'guitar' } satisfies AiExtractStructuredFieldsOptionsField,
        { key: 'books' } satisfies AiExtractStructuredFieldsOptionsField,
      ],
    } satisfies AiExtractStructuredFieldsField,
  ],
  items: [new AiItemBase({ id: file.id })],
  includeConfidenceScore: true,
  aiAgent: aiExtractStructuredAgentBasicTextConfig,
} satisfies AiExtractStructured);

コールを実行するには、以下のパラメータを渡す必要があります。必須のパラメータは太字で示されています。

items配列に含めることができる要素は1つだけです。

パラメータ	説明	例
`metadata_template`	抽出するフィールドを含むメタデータテンプレート。リクエストを機能させるには、`metadata_template`または`fields`を指定する必要がありますが、両方を指定することはできません。
`metadata_template.type`	メタデータテンプレートのタイプ。	`metadata_template`
`metadata_template.scope`	メタデータテンプレートのスコープ。`global`または`enterprise`のいずれかになります。globalテンプレートは、任意のBox Enterpriseで利用できますが、`enterprise`テンプレートは特定のEnterpriseに関連付けられます。	`metadata_template`
`metadata_template.template_key`	メタデータテンプレートの名前。	`invoice`
`items.id`	ドキュメントのBoxファイルID。IDは、拡張子が付いている実際のファイルを参照する必要があります。	`1233039227512`
`items.type`	指定した入力データのタイプ。	`file`
`items.content`	項目のコンテンツ (多くの場合はテキストレプリゼンテーション)。	`This article is about Box AI`.
`fields.type`	フィールドのタイプ。これには、`string`、`float`、`date`、`enum`、`multiSelect`が含まれますが、これらに限定されるものではありません。	`string`
`fields.description`	フィールドの説明。	`The person's name.`
`fields.displayName`	フィールドの表示名。	`Name`
`fields.key`	フィールドの一意の識別子。	`name`
`fields.options`	このフィールドのオプションのリスト。ほとんどの場合、`enum`および`multiSelect`フィールドタイプと組み合わせて使用します。	`[{"key":"First Name"},{"key":"Last Name"}]`
`fields.options.key`	フィールドの一意の識別子。	`First Name`
`fields.prompt`	キー (識別子) に関する追加のコンテキスト。キーの確認方法や形式を含めることができます。	`Name is the first and last name from the email address`
`ai_agent`	デフォルトのエージェント構成を上書きするために使用されるAIエージェント。このパラメータを使用すると、たとえば、`model`パラメータを使用してデフォルトのLLMをカスタムのLLMに置き換えたり、よりカスタマイズされたユーザーエクスペリエンスを実現できるようにベースとなる`prompt`を微調整したり、`temperature`などのLLMパラメータを変更して結果の創造性を調整したりすることができます。`ai_agent`パラメータを使用する前に、`GET 2.0/ai_agent_default`リクエストを使用してデフォルト構成を取得できます。具体的なユースケースについては、AIモデルの上書きに関するチュートリアルを参照してください。

この例では、サンプル請求書から構造化された形でメタデータを抽出する方法を示します。ベンダー名、請求書番号などの詳細情報を抽出する必要があるとします。

Box AIから応答を取得するには、以下のパラメータを使用して、POST /2.0/ai/extract_structuredエンドポイントを呼び出します。

items.typeおよびitems.id: データの抽出元となるファイルを指定します。
fields: 指定したファイルから抽出するデータを指定します。
metadata_template: 既存のメタデータテンプレートを指定します。

fieldsとmetadata_templateのどちらかを使用して、構造を指定できます。両方を使用することはできません。

fieldsパラメータを使用すると、抽出するデータを指定できます。各fieldsオブジェクトにはパラメータのサブセットがあり、それを使用して、検索対象のデータに関する情報を追加できます。たとえば、フィールドのタイプや説明、さらには追加のコンテキストを含めたプロンプトを追加することができます。

curl --location 'https://api.box.com/2.0/ai/extract_structured' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer <ACCESS_TOKEN>'' \
--data '{
    "items": [
        {
            "id": "1517628697289",
            "type": "file"
        }
    ],
    "fields": [
        {
            "key": "document_type",
            "type": "enum",
            "prompt": "what type of document is this?",
            "options": [
                {
                    "key": "Invoice"
                },
                {
                    "key": "Purchase Order"
                },
                {
                    "key": "Unknown"
                }
            ]
        },
        {
            "key": "document_date",
            "type": "date"
        },
        {
            "key": "vendor",
            "description": "The name of the entity.",
            "prompt": "Which vendor is sending this document.",
            "type": "string"
        },
        {
            "key": "document_total",
            "type": "float"
        }
    ]
  }'

応答には、以下のように、指定したフィールドとその値が示されます。

{
    "document_date": "2024-02-13",
    "vendor": "Quasar Innovations",
    "document_total": $1050,
    "document_type": "Purchase Order"
}

メタデータテンプレートを使用する場合は、そのtemplate_key、type、scopeを指定します。

curl --location 'https://api.box.com/2.0/ai/extract_structured' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer <ACCESS_TOKEN>' \
--data '{
    "items": [
        {
            "id": "1517628697289",
            "type": "file"
        }
    ],
    "metadata_template": {
        "template_key": "rbInvoicePO",
        "type": "metadata_template",
        "scope": "enterprise_1134207681"
    }
}'

応答には、以下のように、メタデータテンプレートに含まれているフィールドとその値が示されます。

{
  "documentDate": "February 13, 2024",
  "total": "$1050",
  "documentType": "Purchase Order",
  "vendor": "Quasar Innovations",
  "purchaseOrderNumber": "003"
}

抽出エージェント (強化) を使用するには、次のようにai_agentオブジェクトを指定します。

{
  "ai_agent": {
    "type": "ai_agent_id", 
    "id": "enhanced_extract_agent"
  }
}

抽出エージェント (強化) を使用してデータを抽出するには、以下のいずれかが必要です。

インラインでのフィールド定義 (フィールドが頻繁に変わる場合に最適)
メタデータテンプレート (フィールドが一定である場合に最適)

Box Python SDKを使用したサンプルのコードスニペットを確認してください。

from box_sdk_gen import (
    AiAgentReference,
    AiAgentReferenceTypeField,
    AiItemBase,
    AiItemBaseTypeField,
    BoxClient,
    BoxCCGAuth,
    CCGConfig,
    CreateAiExtractStructuredMetadataTemplate
)

# Create your client credentials grant config from the developer console
ccg_config = CCGConfig(
    client_id="my_box_client_id", # replace with your client id
    client_secret="my_box_client_secret", # replace with your client secret
    user_id="my_box_user_id", # replace with the box user id that has access
                              # to the file you are referencing
)
auth = BoxCCGAuth(config=ccg_config)
client = BoxClient(auth=auth)
# Create the agent config referencing the enhanced extract agent
enhanced_extract_agent_config = AiAgentReference(
    id="enhanced_extract_agent",
    type=AiAgentReferenceTypeField.AI_AGENT_ID
)
# Use the Box SDK to call the extract_structured endpoint
box_ai_response = client.ai.create_ai_extract_structured(
    # Create the items array containing the file information to extract from
    items=[
        AiItemBase(
            id="my_box_file_id", # replace with the file id
            type=AiItemBaseTypeField.FILE
        )
    ],
    # Reference the Box Metadata template 
    metadata_template=CreateAiExtractStructuredMetadataTemplate(
        template_key="InvoicePO",
        scope="enterprise"
    ),
    # Attach the agent config you created earlier
    ai_agent=enhanced_extract_agent_config,
)
print(f"box_ai_response: {box_ai_response.answer}")

ファイルからメタデータを抽出する (構造化)

ファイルからメタデータを抽出する (構造化)

サポートされているファイル形式

サポートされている言語

開始する前に

リクエストの送信

パラメータ

ユースケース

リクエストの作成

`fields`パラメータの使用

メタデータテンプレートの使用

抽出エージェント (強化)

関連するガイド