Log Shipping

Introduction

Qwilt tracks HTTP transactions through transaction logs, which can be used for auditing, performance analysis, or debugging. These logs are collected across the entire Qwilt deployment. To enroll in our Log Delivery Service, open a support ticket by emailing us at support@qwilt.com.

Qwilt supports sending these logs to:

Your Amazon Web Services (AWS) bucket.
Your Google Cloud Storage (GCS) bucket.
Your Datadog account, via the Datadog API.
Your Hydrolix account.

By default, the Media Delivery Log (MDL) files include all logging data. However, the transaction log output can be customized to meet your needs. Contact us at support@qwilt.com to configure any of the following log parameters:

Log file format: For AWS and GCS users, logs can be delivered in TSV (default) or JSON format. Logs shipped to Datadog are always in JSON format. This option is not relevant to Hydrolix users.
Filter by field: By default, all log fields are exported. You can choose to export only specified fields.
Filter by field value: Filter log output based on any field values. For example, you can filter for data relevant to specified Delivery Services (available to Content Publishers) and/or response code classes (2xx/3xx/4xx/5xx).
Log Sampling: Receive a subset of log data based on your preferred sampling percentage.

PII Retention Policy:

Our backend retains Personally Identifiable Information (PII) from Media Delivery Logs for 30 days. After this period, the data is automatically deleted.

Setting the Endpoint

AWS/S3

The Log Pusher to S3 automatically sends the logs to service providers and content providers who are enrolled in a log delivery service for their service or deployment.

To set up the Log Pusher to S3, configure an AWS IAM Role to enable Qwilt to push logs to your Amazon S3 bucket, and then share the Role ARN value with Qwilt.

The following procedure provides step-by-step instructions on how to do this.

In IAM, open the Roles page. Select Create role.

Under Trusted entity type, select Custom trust policy. In the Custom trust policy field, paste the trust policy you received from Qwilt. An example policy is shown below.

{
 "Version": "2012-10-17",
 "Statement": [
   {
     "Effect": "Allow",
     "Principal": {
       "AWS": "arn:aws:iam::052466545929:root"
      },
     "Action": "sts:AssumeRole"
   }
 ]
}

Click Next.

On the Add Permissions page, click Next without defining anything. On the Name, review, and create page, enter a Role name and then select Create role.

In the IAM Roles page, search for the new role you created and open it.
In the Permissions tab, select Add permissions, and then from the drop-down menu, select Create Inline Policy.

On the Specify permissions page, click the JSON button. In the Policy editor, allow the following actions:

s3:PutObject
s3:PutObjectAcl

We recommend also allowing these actions, to facilitate troubleshooting if needed:

s3:GetObject
s3:GetObjectAcl
s3:ListBucket

Under Resource, specify the ARN of the S3 bucket you want to share with Qwilt (e.g., "arn:aws:s3:::qwilt_logs") and the ARN that covers all objects in it (e.g., "arn:aws:s3:::qwilt_logs/*").

Example:

{
	"Version": "2012-10-17",
	"Statement": [
		{
			"Effect": "Allow",
			"Action": [
				"s3:PutObject",
				"s3:GetObjectAcl",
				"s3:GetObject",
				"s3:ListBucket",
				"s3:PutObjectAcl"
			],
			"Resource": [
				"arn:aws:s3:::qwilt_logs",
				"arn:aws:s3:::qwilt_logs/*"
			]
		}
	]
}
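
The same policy can be generated for any bucket name. The following Python sketch is only an illustration: the function name and its parameters are hypothetical, and it simply rebuilds the example document above so that the bucket ARN and the all-objects ARN always refer to the same bucket.

```python
import json


def build_log_pusher_policy(bucket_name: str, include_troubleshooting: bool = True) -> dict:
    """Return an inline IAM policy document for the given S3 bucket (hypothetical helper)."""
    # Actions the procedure above asks you to allow.
    actions = ["s3:PutObject", "s3:PutObjectAcl"]
    if include_troubleshooting:
        # Recommended read-side actions that help with troubleshooting.
        actions += ["s3:GetObject", "s3:GetObjectAcl", "s3:ListBucket"]
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": actions,
                # The bucket ARN covers bucket-level actions such as ListBucket;
                # the "/*" ARN covers every object inside the bucket.
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
            }
        ],
    }


if __name__ == "__main__":
    # Prints a document equivalent to the example above.
    print(json.dumps(build_log_pusher_policy("qwilt_logs"), indent=2))
```

The printed JSON can be pasted into the Policy editor in place of a hand-edited document.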

Click Next.
Review the policy, give it a meaningful name, and then select Create policy.
In the Role page, under Summary, the Role ARN is displayed.

This ARN value needs to be shared with Qwilt in order to complete the setup.

Copy the value and share it with Qwilt, via a secure method of your choice.
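
Before sharing the value, it can be worth checking that what you copied really is a Role ARN and not, for example, the bucket ARN. The sketch below is an optional sanity check under stated assumptions: the function name and the role name in the usage note are hypothetical, and the pattern follows the general IAM role ARN layout (partition, 12-digit account ID, optional path, role name) rather than anything Qwilt specifies.

```python
import re

# General layout of an IAM role ARN: arn:<partition>:iam::<account-id>:role/<path/name>
_ROLE_ARN = re.compile(
    r"^arn:(aws[a-z-]*):iam::(\d{12}):role/([\w+=,.@/-]+)$",
    re.ASCII,
)


def parse_role_arn(arn: str) -> tuple[str, str]:
    """Return (account_id, role_path_and_name), or raise ValueError if the ARN is malformed."""
    match = _ROLE_ARN.match(arn.strip())
    if match is None:
        raise ValueError(f"not an IAM role ARN: {arn!r}")
    return match.group(2), match.group(3)
```

For example, parse_role_arn("arn:aws:iam::123456789012:role/qwilt-log-pusher") returns the account ID and the role name, while a bucket ARN such as "arn:aws:s3:::qwilt_logs" raises ValueError.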

GCS

The Log Pusher to GCS bucket automatically sends the logs to service providers and content providers who are enrolled in a log delivery service for their service or deployment. Transaction logs are pushed as soon as they are created; however, in the event of heavy traffic or communication issues, they may arrive up to 15 minutes later.

To set up the Log Pusher to GCS, you will need to create a service account and assign it to your GCS bucket with the correct role, and then obtain and share the access object with Qwilt.

The following procedure provides step-by-step instructions on how to do this.

Create the service account. You do not need to grant it any permissions at this point. Name the account and click Done.

Search for the desired bucket. In this example, the search was filtered for “mdl”.

Click the desired bucket, and then click the Permissions tab.

Add the relevant service account to the bucket, and then grant it the Storage Object User role.

Your credentials will need to be shared with Qwilt in order to complete the setup. Secrets should be transferred over an encrypted channel (and received secrets will be kept encrypted). Your Qwilt representative will suggest the correct method for sharing your credentials.

Datadog

Transaction logs can be shipped to Datadog via Datadog API. To enable this, contact us at support@qwilt.com.

Hydrolix

Log shipping to Hydrolix is based on a Hydrolix pull API that pulls the required transaction logs from a dedicated bucket within Qwilt Cloud services.

Please contact us at support@qwilt.com to enable this.

Set up the workflow by defining and linking the following in Hydrolix:

User credentials for accessing Qwilt resources.
A table to ingest the log data.
The log fields schema, to specify which of the log fields are parsed and exposed in the table.

STEP 1: Prerequisites

An active Hydrolix account.
Qwilt Cloud (QC) Services user credentials. (Provided by Qwilt.)
JSON file with the needed log fields schema. (Provided by Qwilt.)

STEP 2: Set the QC Services User Credentials in Hydrolix

From the Hydrolix navigation bar, select Add new and then choose Credential.
In the New credential dialog, define the following:
- Name - Assign the Credential any name.
- Description - Any description.
- Cloud Provider Type - Choose AWS Access Keys.
- Access Key Id - Enter the Access Key ID provided by Qwilt.
- Secret Access Key - Enter the Secret Access Key provided by Qwilt.
Select Create credential.

STEP 3: Create a Table in Hydrolix to Ingest the Log Data

From the Hydrolix navigation bar, select Add new and then choose Table.
In the New table dialog, select a project and enter a Table name and Description.
Select Create table.

STEP 4: Define the Log Fields Schema in Hydrolix

From the Hydrolix navigation bar, select Add new and then choose Table Transform.
In the New ingest transform dialog, define the following:
- Select table - Select the table you created in the previous step.
- Transform name - Assign the Table Transform any name.
Select the Upload method.
Browse to and select the JSON file provided by Qwilt.
Select Upload Transform.

STEP 5: Create the Ingest Source

From the Hydrolix navigation bar, select Add new and then choose Table Source.
In the Select Table field, choose the table you created previously.
Under Source type, choose Auto Ingest.
Define the following:
- Name - Assign any name.
- Queue Name - The SQS queue name provided by Qwilt.
- Regex filter - The regex provided by Qwilt.
- Select transform - Choose Default.
- Source Credential - Choose the credential you created to allow access to Qwilt resources.
- Bucket Credential - Choose the credential you created to allow access to Qwilt resources.
Select Add source.

Your Qwilt-Hydrolix log shipping is now set up, and you can start reading new transaction logs as they are ingested into the table you created. Data should start arriving within approximately 5 minutes. If you do not see data within this timeframe, please contact support@qwilt.com.

Setting the Log Shipping File Format

When shipped into a cloud bucket (either S3 or GCS), the transaction logs can be stored as either a TSV (tab-separated values) or a JSON file, compressed in gzip format.

Note: Logs are created every five minutes, or as soon as the log contains 50 MB of data, whichever comes first. Transaction logs are pushed as soon as they are created; however, in the event of heavy traffic or communication issues, they could arrive up to 24 hours later.

The transaction logs’ file naming convention is as follows:

<file start timestamp(YYYYMMDD-HHmmSS)>.<service-name>.<unique Id>.log.gz

For example:

20230801-055130.example-service.03495.log.gz
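
If you process the delivered files programmatically, the naming convention can be parsed as in the following Python sketch. The class and function names are hypothetical. The sketch assumes the unique ID contains no dots and returns a naive datetime, because the time zone of the timestamp is not stated here.

```python
import re
from datetime import datetime
from typing import NamedTuple


class LogFileName(NamedTuple):
    start: datetime      # file start timestamp (time zone not stated in the convention)
    service_name: str
    unique_id: str


# <YYYYMMDD-HHmmSS>.<service-name>.<unique Id>.log.gz
_FILE_NAME = re.compile(r"^(\d{8}-\d{6})\.(.+)\.([^.]+)\.log\.gz$")


def parse_log_file_name(name: str) -> LogFileName:
    """Split a transaction log file name (without any directory part) into its components."""
    match = _FILE_NAME.match(name)
    if match is None:
        raise ValueError(f"unexpected log file name: {name!r}")
    stamp, service_name, unique_id = match.groups()
    return LogFileName(datetime.strptime(stamp, "%Y%m%d-%H%M%S"), service_name, unique_id)
```

Applied to the example above, this yields a start time of 1 August 2023 at 05:51:30, the service name example-service, and the unique ID 03495 (kept as a string so that the leading zero is preserved).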

Log Shipping MDL Field Descriptions

The following tables describe the log fields.

Note that when a field value is not available, a hyphen (-) appears instead.
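
When reading TSV log files, the hyphen placeholder can be mapped to a missing value. The following Python sketch is one way to do this under several assumptions that are not confirmed by this article: the files have no header row, they are UTF-8 encoded, and the caller supplies the field names in the order used by their own pipeline (available from Qwilt). The function names are hypothetical.

```python
import csv
import gzip
from typing import Dict, Iterable, Iterator, Optional, Sequence

Record = Dict[str, Optional[str]]


def parse_tsv_rows(lines: Iterable[str], field_names: Sequence[str]) -> Iterator[Record]:
    """Yield one dict per TSV line, replacing the '-' placeholder with None."""
    # QUOTE_NONE keeps quote characters (for example inside userAgent) as literal text.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        yield {
            name: (None if value == "-" else value)
            for name, value in zip(field_names, row)
        }


def read_tsv_log(path: str, field_names: Sequence[str]) -> Iterator[Record]:
    """Read a gzip-compressed TSV transaction log file from disk."""
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        yield from parse_tsv_rows(handle, field_names)
```

All values are returned as strings; converting fields such as httpResponseCode to integers is left to the caller.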

Note

For large transactions (also known as Content Range Requests or segmented transactions), Qwilt's log shipping defaults to transferring only the client transactions, referred to as 'main transactions.' These main transaction records aggregate information from all slices. Each main transaction log details the total volume served (sentBytes), the total volume read from the Origin (fetchedBytes), and the duration of all slices/sub-transactions (durationMilli). The NumSliceSubTrx field indicates the number of slices used, while NumSliceMisses represents the slices that were missed.
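
As an illustration of how the two slice counters can be used together, the following Python sketch derives the share of slices that did not count as misses for a main transaction. This ratio is not a Qwilt-defined metric; it is a hypothetical example, and the function name is an assumption.

```python
from typing import Optional


def slice_hit_ratio(num_slice_sub_trx: int, num_slice_misses: int) -> Optional[float]:
    """Return the fraction of slices that were not misses, or None if there were no slices."""
    if num_slice_sub_trx <= 0:
        # Not a sliced transaction, so the ratio is undefined.
        return None
    return (num_slice_sub_trx - num_slice_misses) / num_slice_sub_trx
```

For a main transaction with NumSliceSubTrx of 20 and NumSliceMisses of 1, the function returns 0.95.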

Disclaimer

While this article describes the full set of MDL fields, the fields included in your MDL files may vary based on the version of the Qwilt CDN serving your traffic (CP) or in your network (SP). Additionally, the order of fields in the log data may differ.

To receive the precise set of metadata fields and their order for your specific log shipping pipeline, please contact us at support@qwilt.com.

Global Fields

This table presents the full set of MDL fields in the order in which they normally appear in the transaction log. However, your log files may differ.

If your site configuration uses custom URL tokenization, additional MDL fields will be present.

Field	Data Type	Example Value(s)	Description
startTime	String	13-04-23 18:17:02.883	The record creation time, as a date and time.
startTimeEpochMilli	Integer	1681409822883	The record creation time, in milliseconds since the Unix epoch.
durationMilli	Integer	52	Transaction duration in milliseconds.
source	String	dynamic-cdn-extension	Indicates the source of the response. Valid Values: dynamic-cdn-extension - The content was provided by an origin server. cache-delivery - The content was provided by the Qwilt CDN cache. self-generated - The response was generated by the Qwilt CDN cache. This value only applies to HTTP error responses.
siteName	String	c7081-acme-live	An internal ID assigned by Qwilt to your site. This is also known as Delivery Service.
clientIp	String: IPv4 or IPv6 address	$0:1$7PLOwpcQM7p789G4pWAw3+KvaPA=	The obfuscated client-side IP address.
clientPort	Integer	49222	The client-side TCP port.
serverIp	String: IPv4 or IPv6 address	211.94.171.32	The IP address of the CDN cache, from which the content was delivered.
serverPort	Integer	443	The server-side TCP port.
sentBytes	Integer	853222	(Represents the MDL field L7Goodput). The number of L7 bytes transferred to the client (including both content and HTTP response headers).
uri			This field is shipped to CPs only. Find the field description in the table, Fields Relevant Only to Content Providers.
httpResponseCode	Integer	200	The HTTP response code value.
requestRange	String	bytes=795346685-795548657	The HTTP range.
responseRange	String	bytes 795346685-795548657/986715125	The HTTP content range.
userAgent	String	Mozilla/5.0 (iPhone; CPU iPhone OS 16_6 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.5 Mobile/15E148 Safari/604.1	The User-Agent, if the user-agent header was included in the request.
altDeviceName	String	N/A	The device name header of the request.
origServerIp	String	100.100.100.100	The origin server's IP address.
trxStatus	String	accepted	Valid Values: accepted - the transaction was served. rejected - the transaction was rejected.
trxStatusReason	String	physical-capacity-exceeded	Provides more detail about the transaction status. Valid Values: ‘-’ when the transaction was accepted. physical-capacity-exceeded - The transaction was rejected because the link capacity was reached. efficiency-policy - Transaction was rejected because resources are dedicated to prioritized transactions. set-limit-exceeded - Transaction was rejected because a configured global limit was reached.
referer	String	https://www.geeksforgeeks.org/	The Referer HTTP request header value.
httpMethod	String	get	The HTTP method used in the transaction. Valid Values: GET, HEAD, OPTIONS.
fetchedBytes	String	853393	(Represents the MDL field L7UpstreamGoodput). The size of the HTTP content payload plus response headers for the upstream transaction.
httpVersion	String	1.1	The HTTP version.
trigger	String	external	Valid Values: external - An external client initiated the transaction. cdn - The CDN triggered the transaction.
originHostNames	String	origin1.com,origin2.com,origin3.com In this example, we can see that origin1.com was contacted first, and then origin2.com was contacted, and then origin3.com was contacted.	This field is relevant when the ‘source’ is 'dynamic-cdn-extension.' A comma-separated list of the origin servers contacted by the CDN while handling the transaction.
originHttpResponseCodes	String	503,-,200 (In this example, taken together with the example values for originHostNames, we can see that origin1.com responded with 503, origin2.com did not respond, and origin3.com responded with 200).	This field is relevant when the ‘source’ is 'dynamic-cdn-extension.' A comma-separated list of the response codes received from each of the origins listed in the ‘originHostNames’ field.
CacheStatus	String	hit	Describes how the transaction was handled by the cache. Valid Values: hit: The content was served from the local cache. miss: The content was served from the origin server, because it was not available in the cache. bypass: The content was served from the origin server, even though it was available in the cache, due to configuration or specific request information. expired: The content was served from the origin server, because the content in the cache was stale. stale: Stale content was served from the local cache. For example, if the origin server is unavailable and the configuration rules enable serving stale content when the origin server is unavailable. updating: Stale content was served from the local cache, while a background request was sent to the origin server to update it. revalidated: Stale content was revalidated against the origin server and served from the local cache, as it was still up-to-date.
deliveryHopCount	Integer	1	Contains the position of this cache in the chain of caches, from the Edge server to origin (including the Edge). The position always starts from 1, where 1 indicates that this cache is the Edge.
deliveryChainLength	Integer	2	Contains the number of CDN servers in the chain of caches from the Edge server to origin (including the Edge). Chain length is always at least 1.
transactionChainId	String	10000000397c92e9-88108992-592	For transactions delivered by the OCN, a string that uniquely identifies this transaction chain.
downstreamSystemId			This field is shipped to SPs only. Find the field description in the table, Fields Relevant Only to Service Providers.
upstreamSystemId			This field is shipped to SPs only. Find the field description in the table, Fields Relevant Only to Service Providers.
LocationAclRuleName			This field is shipped to CPs only. Find the field description in the table, Fields Relevant Only to Content Providers.
TTFB	Integer	300	Time (in milliseconds), from when the OCN received the request until it sends the first byte of response.
ClientASN	Integer	23456	The ASN of the client IP address.
dataType	String	Content	Either 'content' or 'router-proxy'.
downstreamType	String	Peer	Valid Values: 'Peer': north-south. 'child': east-west. 'internal' if sub-transaction. '-' if the downstream was a client.
TriggerTrxId	String	10000004776d9176-3-1	Client transaction ID.
subTrxType	String	Slice	The transaction type. Valid Values: Slice: indicating that this is a slice/ sub-transaction. Other: Non slice.
sliceRange	String	1024-1535	For a triggered sub-transaction of the type slice, this field holds the range that the sub-transaction is in charge of.
cdnCacheStatus	String	Miss	For client-facing transactions that were sliced. If this transaction, or any of its sub-transactions, had to fetch content from the origin, this field holds ‘miss’.
edgeLocation	String	UK	The CDN location country code.
x-forward-for	String	198.51.100.0/24,197.184.163.0/24	The client's original IP address, represented as a subnet, when connecting through one or more proxies.
originTTFB	Integer	200	Time (in milliseconds) from when the request was sent to the origin until the first byte of response was received from it.
CmcdSid	String	6e2fb550-c457-11e9-bb97-0800200c9a66	Common Media Client Data (CMCD) Session ID.
NumSliceSubTrx	Number	20	For client-facing large transactions. It holds the number of slices and sub-transactions that were triggered.
NumSliceMisses	Number	1	For client-facing large transactions. It holds the number of slices and sub-transactions that had a type of miss, bypass, or expired. Otherwise, it holds 0.
topTierCacheStatus	String		Contains a string that describes how the transaction was handled by the upper-most cache in the chain of caches that handled this transaction. Contains '-' if no other QN was involved in the transaction.
subnet	String	83.219.163.0/24	Contains the client IP address, partially masked as a subnet.
responseRttMsec	String	42	Socket-measured RTT (milliseconds).
contentType	String	text/html;charset=utf-8	The 'Content-Type' HTTP header of the response.
SSL-Cipher	String	TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256	The selected SSL cipher suite.
cacheKey	String		Unique identifier of a Qwilt Node (cache). For non-caching components like static content and test objects, the value may be empty or `NA`.
upstreamDuration	Numeric	11	The time (milliseconds) it took to receive content from the previous tier (another cache or the origin). This value reflects a single hop. It is measured from the receipt of the first byte to the receipt of the last byte. - The `upstreamDuration` value is always less than or equal to the `durationMilli` value. This is because the `upstreamDuration` reflects a single hop, while `durationMilli` represents the total time it took for the end-user to receive the content, including all intermediate hops. - When both of the following are true, the `upstreamDuration` specifically represents the time spent retrieving content from the origin (rather than from a mid-tier cache): the `fetchedBytes` value is greater than zero; the `deliveryHopCount` value is equal to the `deliveryChainLength` value.
ClientIpPlaintext	String	1.1.1.1	Displays the client's IP in plaintext. This is in addition to the `ClientIp` field that displays the obfuscated IP. Legal approval is required to expose the IP in plaintext. If privacy regulations prevent sharing the IP in plaintext, the field value is `'-'`. In MAP-T and Shared-IP scenarios, this field represents the internal provider IP rather than the client’s actual IP address.
xForwardedForPlaintext	String	203.0.113.1, 198.51.100.1, 192.0.2.1	If the client request includes the `X-Forwarded-For` header, this field displays the original IP address of the client in plaintext. This is in addition to the `x-forward-for` field which shows the original client IP address in subnet form. Legal approval is required to expose the IP in plaintext. If privacy regulations prevent sharing the IP in plaintext, the field value is `'-'`.
serviceType	String	vod	The type of service used to deliver the content (e.g. VOD, Live, Software Download). Maps to the `traffic-type` value in the site configuration JSON.
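
As an illustration of how a few of the fields above relate to each other, the following Python sketch converts startTimeEpochMilli to a UTC date and time, pairs originHostNames with originHttpResponseCodes, and applies the two upstreamDuration conditions described above. The function names are hypothetical, and the code assumes the field values have already been extracted from a log record. For the example values above, startTimeEpochMilli 1681409822883 corresponds to 18:17:02.883 UTC on 13 April 2023, which matches the startTime example if that value is read as DD-MM-YY in UTC (an assumption, not a documented format).

```python
from datetime import datetime, timedelta, timezone
from typing import List, Optional, Tuple

_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def epoch_milli_to_utc(start_time_epoch_milli: int) -> datetime:
    """Convert startTimeEpochMilli to a timezone-aware UTC datetime."""
    # Integer arithmetic via timedelta avoids floating-point rounding.
    return _EPOCH + timedelta(milliseconds=start_time_epoch_milli)


def pair_origin_attempts(
    origin_host_names: str, origin_http_response_codes: str
) -> List[Tuple[str, Optional[int]]]:
    """Pair each contacted origin with its response code; None means no response ('-')."""
    if origin_host_names in ("", "-"):
        return []
    hosts = origin_host_names.split(",")
    codes = origin_http_response_codes.split(",")
    return [
        (host.strip(), None if code.strip() == "-" else int(code))
        for host, code in zip(hosts, codes)
    ]


def upstream_is_origin(
    fetched_bytes: int, delivery_hop_count: int, delivery_chain_length: int
) -> bool:
    """True when upstreamDuration reflects retrieval from the origin, per the two conditions above."""
    return fetched_bytes > 0 and delivery_hop_count == delivery_chain_length
```

With the example values from the table, pair_origin_attempts("origin1.com,origin2.com,origin3.com", "503,-,200") pairs origin1.com with 503, origin2.com with no response, and origin3.com with 200.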

Fields Relevant Only to Content Providers

Field	Data Type	Example Value(s)	Description
uri	String	http://qb1.my-cdn.com/some/path	The client URL.
LocationAclRuleName	String	DENY:anonymous-users ALLOW:123.211.0.0/16, 60.10.128.0/18, 192.123.80.0/20	Includes the action applied (allow or deny) and a description of the match condition defined by the ACL rule, such as the specified named list, CIDR blocks, ASNs, or country codes.

Fields Relevant Only to Service Providers

Field	Data Type	Example Value(s)	Description
downstreamSystemId	String		The System ID of the downstream system.
upstreamSystemId	String		The System ID of the upstream system.