OpenTelemetry 2025

1 Architecture Overview

OpenTelemetry uses the Collector as a central hub that receives telemetry from applications and forwards it to one or more backends.

OTLP (OpenTelemetry Protocol): the protocol used to send data from Application → Collector and from Collector → Backend
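
To make the protocol less abstract, here is a minimal sketch of what an OTLP payload can look like when sent as JSON over HTTP. The helper names (`build_otlp_trace_payload`, `send_to_collector`), the scope name, and the `localhost:4318` default are assumptions made for illustration; in practice an SDK builds and sends this for you.

```python
import json
import secrets
import time
import urllib.request


def build_otlp_trace_payload(service_name: str, span_name: str, duration_ms: int) -> dict:
    """Build a minimal OTLP/JSON trace payload containing a single span."""
    end_ns = time.time_ns()
    start_ns = end_ns - duration_ms * 1_000_000
    return {
        "resourceSpans": [{
            "resource": {
                "attributes": [
                    {"key": "service.name", "value": {"stringValue": service_name}}
                ]
            },
            "scopeSpans": [{
                "scope": {"name": "manual-otlp-example"},  # hypothetical scope name
                "spans": [{
                    "traceId": secrets.token_hex(16),  # 16 random bytes -> 32 hex chars
                    "spanId": secrets.token_hex(8),    # 8 random bytes -> 16 hex chars
                    "name": span_name,
                    "kind": 1,  # SPAN_KIND_INTERNAL
                    "startTimeUnixNano": str(start_ns),
                    "endTimeUnixNano": str(end_ns),
                }],
            }],
        }]
    }


def send_to_collector(payload: dict, base_url: str = "http://localhost:4318") -> int:
    """POST the payload to the Collector's OTLP/HTTP traces path and return the HTTP status."""
    request = urllib.request.Request(
        f"{base_url}/v1/traces",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status


if __name__ == "__main__":
    # Only prints the payload; call send_to_collector(payload) once a Collector is running.
    payload = build_otlp_trace_payload("my-app", "demo-span", duration_ms=25)
    print(json.dumps(payload, indent=2))
```

The same structure (resource → scope → spans) is used for the gRPC transport, only encoded as protobuf instead of JSON.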

3 Prerequisites

System fundamentals

Understand how containers (Docker) work
Know the basics of Kubernetes
Understand the microservices concept

Infrastructure

Docker and Docker Compose installed
A Kubernetes cluster (if you use K8s)
Disk space for storing telemetry data

Programming

Understand the concepts of tracing, metrics, and logs
Experience with Python, Node.js, or Go
YAML/JSON configuration

Network

Understand the HTTP and gRPC protocols
Connectivity between services
Docker network configuration

4 Installation Guide

1 Install the OTel Collector with Docker

The OTel Collector can run in several ways: in Docker, on Kubernetes, or installed directly on a server. Docker is recommended for development and prototyping.

# Create docker-compose.yml
mkdir otel-demo && cd otel-demo
cat > docker-compose.yml << 'EOF'
version: '3.8'

services:
  otel-collector:
    image: otel/opentelemetry-collector:0.106.1
    command: ["--config=/etc/otel-collector-config.yaml"]
    ports:
      - "4317:4317"     # OTLP gRPC receiver
      - "4318:4318"     # OTLP HTTP receiver
      - "55679:55679"   # zPages: diagnostics & debugging
    volumes:
      - ./otel-collector-config.yaml:/etc/otel-collector-config.yaml
      - ./data:/data
    networks:
      - otel-network

  # Sample application (replace the image with your own instrumented service)
  sample-app:
    image: otel/java-otel-demo-app:latest
    depends_on:
      - otel-collector
    environment:
      - OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4317
      - OTEL_EXPORTER_OTLP_PROTOCOL=grpc
    networks:
      - otel-network

networks:
  otel-network:
    driver: bridge
EOF

What each setting means

- image: the official OTel Collector image, pinned to version 0.106.1 so the setup is reproducible (newer releases are available)

- ports 4317: OTLP gRPC port (recommended for production)

- ports 4318: OTLP HTTP port (for clients that do not support gRPC)

- ports 55679: zPages port (diagnostics and debugging interface; requires the zpages extension to be enabled in the Collector config)

- OTEL_EXPORTER_OTLP_ENDPOINT: the Collector URL that the application sends data to

2 Create the Configuration File (otel-collector-config.yaml)

The OTel Collector configuration has 4 main parts:

receivers: how data enters the Collector
processors: how data is processed before it is forwarded
exporters: how data leaves the Collector for a backend
pipelines: the path data takes, defined under service by wiring receivers, processors, and exporters together

# otel-collector-config.yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch:
    send_batch_size: 10000
    timeout: 10s
  memory_limiter:
    check_interval: 1s
    limit_mib: 4000
    spike_limit_mib: 500

exporters:
  # debug replaces the deprecated logging exporter
  debug:
    verbosity: detailed
  otlp:
    endpoint: backend:4317   # placeholder host; step 3 points this at Jaeger
    tls:
      insecure: true

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [debug, otlp]
    metrics:
      receivers: [otlp]
      # memory_limiter goes first so it can refuse data before it is batched
      processors: [memory_limiter, batch]
      exporters: [debug]
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [debug]

Configuration explained

receivers (data in):

grpc endpoint:4317 - receives OTLP over gRPC (high performance)
http endpoint:4318 - receives OTLP over HTTP (supports a wider range of clients)

processors (processing):

batch - groups telemetry into batches before sending (reduces network overhead)
memory_limiter - keeps the Collector from using too much memory (OOM)
send_batch_size:10000 - sends a batch once 10,000 items (spans, metric data points, or log records) have accumulated
timeout:10s - sends whatever has accumulated after 10 seconds, even if there are fewer than 10,000 items
limit_mib:4000 - caps memory at 4000 MiB (about 4 GB)

exporters (data out):

debug - prints received telemetry to the console (for debugging)
otlp - forwards to another Collector or to a backend

pipelines (data paths):

Traces path: OTLP → batch → debug + otlp
Metrics path: OTLP → memory_limiter → batch → debug
Logs path: OTLP → batch → debug
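
A pipeline can only name components that are defined in the top-level receivers, processors, and exporters sections; otherwise the Collector refuses to start. The sketch below checks that rule on a plain Python dict shaped like the YAML above. The function name `undefined_components` and the sample dict are hypothetical, and loading the YAML file itself is left out to keep the example dependency-free.

```python
def undefined_components(config: dict) -> list[str]:
    """Return 'pipeline.section: name' for every component a pipeline uses but the config never defines."""
    problems = []
    pipelines = config.get("service", {}).get("pipelines", {})
    for pipeline_name, pipeline in pipelines.items():
        for section in ("receivers", "processors", "exporters"):
            defined = config.get(section, {})
            for component in pipeline.get(section, []):
                if component not in defined:
                    problems.append(f"{pipeline_name}.{section}: {component}")
    return problems


# Hypothetical config dict mirroring the structure of otel-collector-config.yaml
sample_config = {
    "receivers": {"otlp": {}},
    "processors": {"batch": {}, "memory_limiter": {}},
    "exporters": {"debug": {}, "otlp": {}},
    "service": {
        "pipelines": {
            "traces": {"receivers": ["otlp"], "processors": ["batch"], "exporters": ["debug", "otlp"]},
            "metrics": {"receivers": ["otlp"], "processors": ["memory_limiter", "batch"], "exporters": ["debug"]},
            # "jaeger" is deliberately not defined under exporters
            "logs": {"receivers": ["otlp"], "processors": ["batch"], "exporters": ["jaeger"]},
        }
    },
}

if __name__ == "__main__":
    for problem in undefined_components(sample_config):
        print("undefined:", problem)
```

Running a check like this before `docker compose up` catches typos in pipeline lists without waiting for the Collector to fail at startup.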

3 Install the Backends (Prometheus, Jaeger, Loki)

Add backend services to docker-compose.yml to store the data that the Collector sends, and add matching exporters to the Collector configuration.

# Add to otel-collector-config.yaml (exporters section)
exporters:
  otlp:
    endpoint: jaeger:4317
    tls:
      insecure: true
  # Prometheus ingests OTLP over HTTP, not gRPC
  otlphttp/prometheus:
    endpoint: http://prometheus:9090/api/v1/otlp
  # Loki 3.x ingests OTLP over HTTP under /otlp
  otlphttp/loki:
    endpoint: http://loki:3100/otlp

# Then list each exporter in the matching pipeline under service.pipelines,
# for example exporters: [otlphttp/prometheus] for metrics.

# Add to docker-compose.yml (backend services)
services:
  # Jaeger (tracing backend)
  jaeger:
    image: jaegertracing/all-in-one:1.58
    environment:
      - COLLECTOR_OTLP_ENABLED=true  # accept OTLP on 4317/4318
    ports:
      - "16686:16686"  # Jaeger UI
      - "14268:14268"  # Collector HTTP (jaeger.thrift)
      - "14250:14250"  # Collector gRPC (model.proto)
    networks:
      - otel-network

  # Prometheus (metrics backend)
  prometheus:
    image: prom/prometheus:v2.54.0
    command:
      - --config.file=/etc/prometheus/prometheus.yml
      - --enable-feature=otlp-write-receiver  # accept OTLP at /api/v1/otlp
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    networks:
      - otel-network

  # Loki (logs backend)
  loki:
    image: grafana/loki:3.2.1
    ports:
      - "3100:3100"
    networks:
      - otel-network

  # Grafana (visualization)
  grafana:
    image: grafana/grafana:11.4.0
    ports:
      - "3000:3000"
    volumes:
      - grafana-storage:/var/lib/grafana
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=admin  # change this outside local development
    networks:
      - otel-network
    depends_on:
      - prometheus
      - loki

volumes:
  grafana-storage:

networks:
  otel-network:

Commonly used ports

Service	Port	Description
OTel Collector (gRPC)	4317	Standard port for receiving data from applications
OTel Collector (HTTP)	4318	Alternative for clients that do not support gRPC
Jaeger UI	16686	View trace data in the web UI
Prometheus	9090	Query metrics and alerts
Grafana	3000	Visualize dashboards
Loki	3100	Receive and query logs

4 Instrument the Application

Add the OTel SDK to the application so that it sends traces, metrics, and logs to the Collector.

Python Example

pip install opentelemetry-distro
pip install opentelemetry-exporter-otlp

import os
from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import (
    BatchSpanProcessor,
    ConsoleSpanExporter,
    SimpleSpanProcessor,
)

# Set the OTLP endpoint before the exporter is created (it reads these variables)
os.environ["OTEL_EXPORTER_OTLP_ENDPOINT"] = "http://otel-collector:4317"
os.environ["OTEL_EXPORTER_OTLP_PROTOCOL"] = "grpc"

# Initialize tracer
provider = TracerProvider()
trace.set_tracer_provider(provider)
tracer = trace.get_tracer(__name__)

# Send spans to the Collector in batches
provider.add_span_processor(BatchSpanProcessor(OTLPSpanExporter()))

# Also export to the console for debugging
provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))

Node.js Example

npm install @opentelemetry/api
npm install @opentelemetry/sdk-node
npm install @opentelemetry/exporter-trace-otlp-grpc

const { NodeSDK } = require('@opentelemetry/sdk-node');
const { OTLPTraceExporter } = require('@opentelemetry/exporter-trace-otlp-grpc');

// service.version is read from the standard OTEL_RESOURCE_ATTRIBUTES variable
process.env.OTEL_RESOURCE_ATTRIBUTES = 'service.version=1.0.0';

const sdk = new NodeSDK({
  serviceName: 'my-app',
  traceExporter: new OTLPTraceExporter({
    url: 'http://otel-collector:4317', // OTLP/gRPC endpoint
  }),
});

// Registers the tracer provider and starts exporting spans
sdk.start();

console.log('OTel initialized! Tracing started...');

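Auto-Instrumentation (the easier way)

OTel provides auto-instrumentation for many frameworks, which requires no code changes:

# Node.js - auto-instrumentation for Express and other common libraries
# (requires: npm install @opentelemetry/auto-instrumentations-node)
node --require @opentelemetry/auto-instrumentations-node/register app.js

# Python - Flask auto-instrumentation
# (run `opentelemetry-bootstrap -a install` once to install the instrumentation packages)
opentelemetry-instrument --traces_exporter otlp --service_name myapp python app.py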

# Java - JAR-based instrumentation
java -javaagent:opentelemetry-javaagent.jar \
  -Dotel.service.name=my-java-app \
  -Dotel.exporter.otlp.endpoint=http://otel-collector:4317 \
  -jar app.jar

5 Code Examples

Creating a Custom Span (Tracing)

Create a custom span to measure how long a specific operation takes:

from flask import Flask, request
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace.export import SimpleSpanProcessor
from opentelemetry.trace import Status, StatusCode

# Setup tracer
trace.set_tracer_provider(TracerProvider())
trace.get_tracer_provider().add_span_processor(
    SimpleSpanProcessor(OTLPSpanExporter(endpoint="http://otel-collector:4317"))
)
tracer = trace.get_tracer(__name__)

app = Flask(__name__)


def expensive_operation():
    # Placeholder for your real business logic
    return 42


@app.route('/api/process')
def process_data():
    with tracer.start_as_current_span("process-data") as span:
        # Add attributes to span
        span.set_attribute("http.method", request.method)
        span.set_attribute("http.url", request.url)
        span.set_attribute("user_id", "12345")

        # Your business logic here
        result = expensive_operation()

        # Set status
        span.set_status(Status(StatusCode.OK))
        return {"status": "success", "result": result}

Key span attributes:

Attribute	Type	Description
span.kind	string	internal, server, client, producer, consumer
http.method	string	GET, POST, PUT, DELETE
http.status_code	int	200, 404, 500, etc.
error	bool	Whether an error occurred (true/false)
db.statement	string	SQL query statement
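
The type column matters because span attribute values are restricted to strings, booleans, integers, floats, and homogeneous arrays of those. The following is a small sketch of that rule as a pre-check; `is_valid_attribute_value` is a hypothetical helper, not part of the OpenTelemetry API.

```python
PRIMITIVE_TYPES = (str, bool, int, float)


def is_valid_attribute_value(value) -> bool:
    """Return True if value is a primitive or a homogeneous sequence of primitives."""
    if type(value) in PRIMITIVE_TYPES:
        return True
    if isinstance(value, (list, tuple)):
        # Compare exact types so that bool and int are not mixed in one array
        element_types = {type(item) for item in value}
        return len(element_types) <= 1 and element_types <= set(PRIMITIVE_TYPES)
    return False


if __name__ == "__main__":
    candidates = {
        "http.method": "GET",
        "http.status_code": 200,
        "error": False,
        "tags": ["a", "b"],
        "payload": {"nested": "dict"},  # not allowed: convert to a string first
    }
    for key, value in candidates.items():
        print(key, "->", "ok" if is_valid_attribute_value(value) else "invalid")
```

Values that fail the check, such as dicts or None, are best converted to a string before being passed to `span.set_attribute`.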

Measuring Metrics (Counters, Gauges)

from flask import Flask, request
from opentelemetry import metrics
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.exporter.otlp.proto.grpc.metric_exporter import OTLPMetricExporter
from opentelemetry.sdk.metrics.export import PeriodicExportingMetricReader

# Setup meter
metric_reader = PeriodicExportingMetricReader(OTLPMetricExporter(endpoint="http://otel-collector:4317"))
meter_provider = MeterProvider(metric_readers=[metric_reader])
metrics.set_meter_provider(meter_provider)
meter = metrics.get_meter(__name__)

# Create counters
request_counter = meter.create_counter(
    name="http_requests_total",
    description="Total number of HTTP requests",
    unit="1"
)

error_counter = meter.create_counter(
    name="http_errors_total",
    description="Total number of HTTP errors",
    unit="1"
)

# Track in-flight requests with an UpDownCounter (the value rises and falls)
current_requests_gauge = meter.create_up_down_counter(
    name="http_requests_inflight",
    description="Number of requests currently in flight",
    unit="1"
)

app = Flask(__name__)

@app.route('/api/data')
def get_data():
    current_requests_gauge.add(1, {"endpoint": "/api/data"})
    
    try:
        request_counter.add(1, {"endpoint": "/api/data", "method": "GET"})
        # Process request...
        return {"data": "result"}
    except Exception as e:
        # Record the exception type, not the message, to keep attribute cardinality low
        error_counter.add(1, {"endpoint": "/api/data", "error.type": type(e).__name__})
        raise
    finally:
        current_requests_gauge.add(-1, {"endpoint": "/api/data"})

Metric types:

Type	Used for	Behavior
Counter	Counting events (requests, errors)	Only increases (never decreases)
UpDownCounter	Queue size, active connections	Can go up or down (both + and -)
Histogram	Response time, latency	Distributes values into buckets
Gauge	CPU usage, memory, temperature	Current value at a point in time
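
The differences in the table can be seen in a toy model. The classes below are a simplified sketch written for this explanation, not the SDK's implementation: real instruments also handle attributes, aggregation temporality, and export, and the real Counter may drop a negative value with a warning instead of raising. The bucket boundaries are arbitrary example values.

```python
from bisect import bisect_left


class Counter:
    """Monotonic sum: the value can only grow."""

    def __init__(self):
        self.value = 0

    def add(self, amount):
        if amount < 0:
            raise ValueError("Counter increments must be non-negative")
        self.value += amount


class UpDownCounter:
    """Non-monotonic sum: accepts positive and negative deltas."""

    def __init__(self):
        self.value = 0

    def add(self, amount):
        self.value += amount


class Histogram:
    """Counts recorded values per bucket; each bucket is (lower, upper]."""

    def __init__(self, boundaries):
        self.boundaries = sorted(boundaries)
        self.bucket_counts = [0] * (len(self.boundaries) + 1)
        self.count = 0
        self.sum = 0

    def record(self, value):
        # bisect_left places a value equal to a boundary in the bucket that ends at it
        self.bucket_counts[bisect_left(self.boundaries, value)] += 1
        self.count += 1
        self.sum += value


if __name__ == "__main__":
    latency_ms = Histogram([100, 250, 500])  # example boundaries in milliseconds
    for sample in (50, 100, 120, 600):
        latency_ms.record(sample)
    print(latency_ms.bucket_counts, latency_ms.count, latency_ms.sum)
```

A Gauge has no accumulation at all: it simply reports the most recent observed value, which is why it suits readings such as CPU usage.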

6 Use Cases in Thailand

E-Commerce Platform

Measure the performance of the ordering system and trace each order from the customer through to delivery.

Measure the latency of each API endpoint

Track the error rate of the payment service

Measure the queue size of order processing

Healthcare System

Monitor the stability of a Hospital Information System (HIS).

Measure the response time of the lab results system

Trace medicine prescription transactions

Detect errors in the billing system

FinTech Application

Monitor financial systems and inspect transactions in real time.

Measure transaction latency across the whole system

Track the error rate of money transfers

Monitor the payment processing queue

Multi-Cloud Platform

Use OpenTelemetry as a unified pipeline for AWS, Azure, and GCP.

Export to AWS CloudWatch and X-Ray

Export to Azure Monitor

Export to GCP Operations (Stackdriver)

7 Troubleshooting

Problem: No data reaches the Collector

Symptoms:

No data appears in Jaeger or Grafana
The Collector logs show no activity
The application logs report connection errors

Check step by step:

# 1. Check whether the Collector is running
docker ps | grep otel-collector

# 2. Check that the port is listening
netstat -tlnp | grep 4317
# or
ss -tlnp | grep 4317

# 3. Test the connection to the Collector
telnet localhost 4317
# or
curl -v http://localhost:4318

# 4. View the Collector logs (run from the directory containing docker-compose.yml)
docker compose logs otel-collector

# 5. Add the debug exporter to see whether the Collector receives data
exporters:
  debug:
    verbosity: detailed
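
Steps 2 and 3 can also be scripted, which is handy inside a container that has no telnet or curl. This is a minimal sketch using only the standard library; `can_connect` is a hypothetical helper name, and `localhost` should be replaced with the Collector's hostname when running from another container.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures
        return False


if __name__ == "__main__":
    for name, port in {"OTLP gRPC": 4317, "OTLP HTTP": 4318}.items():
        status = "open" if can_connect("localhost", port) else "unreachable"
        print(f"{name} ({port}): {status}")
```

A successful TCP connection only shows that something is listening on the port; it does not confirm that the listener speaks the protocol the application is using.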

Common Causes:

1. Port closed or blocked by a firewall:

Check that ports 4317 and 4318 are allowed in the security group or firewall

2. Network isolation:

The application and the Collector are on different networks (check DNS names or IP addresses)

3. Protocol mismatch:

The application sends HTTP but the Collector only accepts gRPC (or vice versa)

Problem: The Collector uses too much memory (Memory High)

The Collector's memory usage grows until it hits OOM (Out of Memory) or responds slowly.

Solution:

processors:
  # Memory Limiter - caps memory usage
  memory_limiter:
    check_interval: 1s
    limit_mib: 4000      # cap at about 4 GB (adjust to the RAM available)
    spike_limit_mib: 500 # allow spikes of up to 500 MiB

  # Batch Processor - groups data before sending
  batch:
    send_batch_size: 10000  # number of items to collect before sending
    timeout: 10s            # send after 10s even if fewer than 10k items

Recommendation: set limit_mib to 50-70% of the available RAM. For example, with 8 GB of RAM, set limit_mib: 5000 (about 5 GB).
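
The sizing rule can be written down as a small calculation. This is a sketch under assumptions: `memory_limiter_settings` is a hypothetical helper, 60% is just one point inside the 50-70% range, and the 20% spike allowance is an assumption chosen for illustration.

```python
def memory_limiter_settings(total_ram_mib: int, limit_percent: int = 60, spike_percent: int = 20) -> dict:
    """Derive memory_limiter values from total RAM using integer percentages."""
    if not 0 < limit_percent < 100:
        raise ValueError("limit_percent must be between 1 and 99")
    limit_mib = total_ram_mib * limit_percent // 100
    spike_limit_mib = limit_mib * spike_percent // 100
    return {
        "check_interval": "1s",
        "limit_mib": limit_mib,
        "spike_limit_mib": spike_limit_mib,
    }


if __name__ == "__main__":
    # 8 GB of RAM expressed in MiB
    print(memory_limiter_settings(8192))
```

For a container, base the calculation on the container's memory limit rather than the host's total RAM, since that is the ceiling at which the process would be killed.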

Problem: High network latency

Data reaches the Collector slowly, or packets are lost.

Solution:

1. Use gRPC instead of HTTP:

gRPC with binary protobuf typically has less overhead than HTTP with JSON payloads

2. Batch Processing:

Increase send_batch_size to reduce the number of network calls

3. Run the Collector close to the application:

Use the sidecar pattern: run the Collector as a separate container alongside the application (in the same pod or on the same host)

4. Connection Pooling:

Configure the exporter to reuse connections
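
To see why batching reduces network calls, consider a toy model of the size-or-timeout rule that send_batch_size and timeout describe. `BatchBuffer` is a hypothetical class written for this explanation; the Collector's batch processor is more involved, and time is passed in explicitly here so the behavior is easy to follow.

```python
class BatchBuffer:
    """Toy model: flush when the batch is full or when the oldest item has waited too long."""

    def __init__(self, send_batch_size: int, timeout_s: float):
        self.send_batch_size = send_batch_size
        self.timeout_s = timeout_s
        self.items = []
        self.first_item_at = None
        self.sent_batches = []  # each entry stands for one network call

    def add(self, item, now: float) -> None:
        if not self.items:
            self.first_item_at = now
        self.items.append(item)
        if len(self.items) >= self.send_batch_size:
            self._flush()

    def tick(self, now: float) -> None:
        """Call periodically; sends a partial batch once the timeout has elapsed."""
        if self.items and now - self.first_item_at >= self.timeout_s:
            self._flush()

    def _flush(self) -> None:
        self.sent_batches.append(self.items)
        self.items = []
        self.first_item_at = None


if __name__ == "__main__":
    buffer = BatchBuffer(send_batch_size=3, timeout_s=10.0)
    for second, span in enumerate(["a", "b", "c", "d"]):
        buffer.add(span, now=float(second))
    buffer.tick(now=5.0)   # "d" has waited 2s: nothing is sent yet
    buffer.tick(now=13.0)  # "d" has waited 10s: the partial batch is sent
    print(len(buffer.sent_batches), "network calls for 4 spans")
```

A larger batch size means fewer calls but more data held in memory and a longer delay before telemetry becomes visible, so the two settings are usually tuned together.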

Problem: Security / Authentication

The Collector rejects data from the application because authentication fails.

Solution:

# OTel Collector: enable authentication
# (the basicauth extension ships in the otel/opentelemetry-collector-contrib image)
extensions:
  basicauth/server:
    htpasswd:
      inline: |
        username:password

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
        tls:
          cert_file: /path/to/cert.pem
          key_file: /path/to/key.pem
        auth:
          authenticator: basicauth/server
      http:
        endpoint: 0.0.0.0:4318

service:
  extensions: [basicauth/server]

# Client: send credentials (the space after "Basic" is percent-encoded)
export OTEL_EXPORTER_OTLP_HEADERS="Authorization=Basic%20dXNlcm5hbWU6cGFzc3dvcmQ="

# Or in Python
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

exporter = OTLPSpanExporter(
    endpoint="https://collector:4317",  # https because TLS is enabled on the receiver
    headers=(("authorization", "Basic dXNlcm5hbWU6cGFzc3dvcmQ="),)
)
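
The token in the header is simply the Base64 encoding of `username:password`. A short sketch of how to produce it, with `basic_auth_header` as a hypothetical helper name and the literal credentials used only as placeholders:

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Return the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


if __name__ == "__main__":
    # Placeholder credentials; load real ones from a secret store or environment variable
    print(basic_auth_header("username", "password"))
    # prints "Basic dXNlcm5hbWU6cGFzc3dvcmQ="
```

Base64 is an encoding, not encryption, so Basic authentication should only be used over a TLS-protected connection.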
