MongoDB → PostgreSQL Log Migration Plan

📋 Migration Goals

  • ✅ Remove the MongoDB dependency and standardize on PostgreSQL for storage
  • ✅ Keep the existing annotation API unchanged (no business-code changes)
  • ✅ Simplify the deployment architecture and reduce middleware maintenance cost
  • ✅ Support high-volume log storage (partitioned table design)

🎯 Log Model Inventory

| # | Model Class | Module | Purpose |
| --- | --- | --- | --- |
| 1 | BeyondSoftLoginLog | beyondsoft-log-spring3-starter | Login logs |
| 2 | BeyondSoftRequestLog | beyondsoft-log-spring3-starter | Request logs |
| 3 | BeyondSoftDataAuditLog | beyondsoft-log-spring3-starter | Audit logs |
| 4 | BeyondSoftThirdPartLog | beyondsoft-web-spring3-starter | Third-party call logs |

📊 Database Table Design (Optimized for High-Volume Logs)

Design Principles

  1. Partitioning strategy: automatic daily partitions (RANGE PARTITION), a best practice for high-volume log scenarios
  2. Index optimization: index only frequently queried columns to avoid degrading write performance
  3. Column types: TEXT for large text, JSONB for structured data
  4. No automatic deletion: historical data is cleaned up by a periodic archiving script

Choosing a Partition Strategy

| Daily Log Volume | Recommended Strategy | Single Partition Size | Query Performance |
| --- | --- | --- | --- |
| < 100k rows/day | Monthly | ~60 GB | ⭐⭐⭐ |
| 100k–500k rows/day | Weekly | ~14 GB | ⭐⭐⭐⭐ |
| > 500k rows/day | Daily | ~2 GB | ⭐⭐⭐⭐⭐ |

1. Login Log Table (daily partitions)

sql
-- Parent table
CREATE TABLE beyond_soft_login_log (
    id BIGSERIAL,
    username VARCHAR(100),
    ip VARCHAR(50),
    location VARCHAR(200),
    platform VARCHAR(50),
    browser VARCHAR(100),
    os VARCHAR(100),
    status VARCHAR(50) DEFAULT '登录成功',  -- default value means "login successful"
    response TEXT,
    request_time TIMESTAMP NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (id, request_time)  -- the partition key must be part of the primary key
) PARTITION BY RANGE (request_time);

-- Create indexes (indexes on the parent table are inherited by all partitions)
CREATE INDEX idx_login_log_username ON beyond_soft_login_log(username);
CREATE INDEX idx_login_log_request_time ON beyond_soft_login_log(request_time);
CREATE INDEX idx_login_log_ip ON beyond_soft_login_log(ip);

-- Example partitions (one per day)
CREATE TABLE beyond_soft_login_log_2025_01_01 PARTITION OF beyond_soft_login_log
    FOR VALUES FROM ('2025-01-01') TO ('2025-01-02');

CREATE TABLE beyond_soft_login_log_2025_01_02 PARTITION OF beyond_soft_login_log
    FOR VALUES FROM ('2025-01-02') TO ('2025-01-03');

-- Partitions for later dates can be created automatically by a script

Partition management SQL (auto-creates partitions for the next 7 days):

sql
CREATE OR REPLACE FUNCTION create_login_log_partitions()
RETURNS void AS $$
DECLARE
    start_date DATE;
    end_date DATE;
    partition_name TEXT;
    i INT;
BEGIN
    FOR i IN 0..6 LOOP  -- create partitions for the next 7 days
        start_date := CURRENT_DATE + i;
        end_date := start_date + INTERVAL '1 day';
        partition_name := 'beyond_soft_login_log_' || to_char(start_date, 'YYYY_MM_DD');
        
        -- skip if the partition already exists
        IF NOT EXISTS (
            SELECT 1 FROM pg_tables WHERE tablename = partition_name
        ) THEN
            EXECUTE format(
                'CREATE TABLE %I PARTITION OF beyond_soft_login_log FOR VALUES FROM (%L) TO (%L)',
                partition_name, start_date, end_date
            );
            RAISE NOTICE 'Created partition: %', partition_name;
        END IF;
    END LOOP;
END;
$$ LANGUAGE plpgsql;

-- Run automatically every day (requires the pg_cron extension)
-- SELECT cron.schedule('create-log-partitions', '0 2 * * *', 'SELECT create_login_log_partitions()');  -- runs daily at 2 AM

2. Request Log Table (daily partitions)

sql
-- Parent table
CREATE TABLE beyond_soft_request_log (
    id BIGSERIAL,
    ip VARCHAR(50),
    location VARCHAR(200),
    request_type VARCHAR(20),
    url VARCHAR(1000),
    request_params TEXT,
    content_type VARCHAR(100),
    operator VARCHAR(100),
    user_agent TEXT,
    start_time TIMESTAMP,
    end_time TIMESTAMP,
    spend_time BIGINT,
    status INTEGER,
    response TEXT,
    error TEXT,
    trace_id VARCHAR(100),
    request_time TIMESTAMP NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (id, request_time)
) PARTITION BY RANGE (request_time);

-- Indexes (high-frequency query columns only)
CREATE INDEX idx_request_log_url ON beyond_soft_request_log USING hash(url);  -- hash indexes suit exact-match lookups
CREATE INDEX idx_request_log_trace_id ON beyond_soft_request_log(trace_id);
CREATE INDEX idx_request_log_request_time ON beyond_soft_request_log(request_time);
CREATE INDEX idx_request_log_operator ON beyond_soft_request_log(operator);

-- Partitions (one per day)
CREATE TABLE beyond_soft_request_log_2025_01_01 PARTITION OF beyond_soft_request_log
    FOR VALUES FROM ('2025-01-01') TO ('2025-01-02');

CREATE TABLE beyond_soft_request_log_2025_01_02 PARTITION OF beyond_soft_request_log
    FOR VALUES FROM ('2025-01-02') TO ('2025-01-03');

3. Audit Log Table (daily partitions, JSONB-optimized)

sql
-- Parent table
CREATE TABLE beyond_soft_data_audit_log (
    id BIGSERIAL,
    operator VARCHAR(100),
    modify_date TIMESTAMP,
    operation VARCHAR(50),
    handle_name VARCHAR(200),
    modify_content TEXT,
    modifier_ip VARCHAR(50),
    modifier_location VARCHAR(200),
    business_id VARCHAR(100),
    business_data_status VARCHAR(50),
    old_object JSONB,  -- PostgreSQL-native JSON storage
    new_object JSONB,
    response TEXT,
    request_time TIMESTAMP NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (id, request_time)
) PARTITION BY RANGE (request_time);

-- Indexes
CREATE INDEX idx_audit_log_operator ON beyond_soft_data_audit_log(operator);
CREATE INDEX idx_audit_log_business_id ON beyond_soft_data_audit_log(business_id);
CREATE INDEX idx_audit_log_request_time ON beyond_soft_data_audit_log(request_time);
CREATE INDEX idx_audit_log_operation ON beyond_soft_data_audit_log(operation);

-- GIN indexes on the JSONB columns (enable JSON queries)
CREATE INDEX idx_audit_log_old_object_gin ON beyond_soft_data_audit_log USING gin(old_object);
CREATE INDEX idx_audit_log_new_object_gin ON beyond_soft_data_audit_log USING gin(new_object);

-- Partitions (one per day)
CREATE TABLE beyond_soft_data_audit_log_2025_01_01 PARTITION OF beyond_soft_data_audit_log
    FOR VALUES FROM ('2025-01-01') TO ('2025-01-02');

CREATE TABLE beyond_soft_data_audit_log_2025_01_02 PARTITION OF beyond_soft_data_audit_log
    FOR VALUES FROM ('2025-01-02') TO ('2025-01-03');

JSONB query examples:

sql
-- Records whose old_object contains a given key/value pair
SELECT * FROM beyond_soft_data_audit_log 
WHERE old_object @> '{"status": "draft"}';

-- Filter on a single field inside new_object
SELECT * FROM beyond_soft_data_audit_log 
WHERE new_object->>'userId' = '12345';

4. Third-Party Call Log Table (daily partitions)

sql
-- Parent table
CREATE TABLE beyond_soft_third_part_log (
    id BIGSERIAL,
    request_id VARCHAR(100),
    protocols VARCHAR(20),
    url VARCHAR(1000),
    query_params TEXT,
    body TEXT,
    call_source VARCHAR(200),
    method_name VARCHAR(200),
    headers JSONB,  -- Map<String, String> stored as JSONB
    start_time TIMESTAMP NOT NULL,
    end_time TIMESTAMP,
    spend_time BIGINT,
    response TEXT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (id, start_time)
) PARTITION BY RANGE (start_time);

-- Indexes
CREATE INDEX idx_third_part_log_request_id ON beyond_soft_third_part_log(request_id);
CREATE INDEX idx_third_part_log_url ON beyond_soft_third_part_log USING hash(url);
CREATE INDEX idx_third_part_log_call_source ON beyond_soft_third_part_log(call_source);
CREATE INDEX idx_third_part_log_start_time ON beyond_soft_third_part_log(start_time);

-- Partitions (one per day)
CREATE TABLE beyond_soft_third_part_log_2025_01_01 PARTITION OF beyond_soft_third_part_log
    FOR VALUES FROM ('2025-01-01') TO ('2025-01-02');

CREATE TABLE beyond_soft_third_part_log_2025_01_02 PARTITION OF beyond_soft_third_part_log
    FOR VALUES FROM ('2025-01-02') TO ('2025-01-03');

🔧 Partition Management Scripts

Auto-create partitions for all log tables

sql
CREATE OR REPLACE FUNCTION create_all_log_partitions(days_ahead INT DEFAULT 7)
RETURNS void AS $$
DECLARE
    start_date DATE;
    end_date DATE;
    partition_suffix TEXT;
    tables TEXT[] := ARRAY[
        'beyond_soft_login_log',
        'beyond_soft_request_log',
        'beyond_soft_data_audit_log',
        'beyond_soft_third_part_log'
    ];
    table_name TEXT;
    partition_name TEXT;
    i INT;
BEGIN
    FOREACH table_name IN ARRAY tables LOOP
        FOR i IN 0..days_ahead-1 LOOP  -- create partitions for the next N days
            start_date := CURRENT_DATE + i;
            end_date := start_date + INTERVAL '1 day';
            partition_suffix := to_char(start_date, 'YYYY_MM_DD');
            partition_name := table_name || '_' || partition_suffix;
            
            IF NOT EXISTS (
                SELECT 1 FROM pg_tables WHERE tablename = partition_name
            ) THEN
                EXECUTE format(
                    'CREATE TABLE %I PARTITION OF %I FOR VALUES FROM (%L) TO (%L)',
                    partition_name, table_name, start_date, end_date
                );
                RAISE NOTICE 'Created partition: %', partition_name;
            END IF;
        END LOOP;
    END LOOP;
END;
$$ LANGUAGE plpgsql;

-- Run manually (creates the next 7 days by default)
SELECT create_all_log_partitions();

-- Run automatically every day (requires the pg_cron extension)
-- SELECT cron.schedule('create-all-log-partitions', '0 2 * * *', 'SELECT create_all_log_partitions(7)');

Periodically drop old partitions (retain 90 days)

sql
CREATE OR REPLACE FUNCTION cleanup_old_log_partitions(days_to_keep INT DEFAULT 90)
RETURNS void AS $$
DECLARE
    cutoff_date DATE;
    partition_record RECORD;
    partition_date DATE;
BEGIN
    cutoff_date := CURRENT_DATE - days_to_keep;
    
    FOR partition_record IN
        SELECT tablename FROM pg_tables
        WHERE tablename ~ '^beyond_soft_(login|request|data_audit|third_part)_log_\d{4}_\d{2}_\d{2}$'
    LOOP
        -- extract the date from the table name (e.g. beyond_soft_login_log_2024_06_15 -> 2024-06-15)
        partition_date := to_date(
            substring(partition_record.tablename from '\d{4}_\d{2}_\d{2}$'),
            'YYYY_MM_DD'
        );
        
        IF partition_date < cutoff_date THEN
            -- optionally archive the rows first
            EXECUTE format(
                'CREATE TABLE IF NOT EXISTS %I_archive (LIKE %I INCLUDING ALL)',
                regexp_replace(partition_record.tablename, '_\d{4}_\d{2}_\d{2}$', ''),
                partition_record.tablename
            );
            EXECUTE format(
                'INSERT INTO %I_archive SELECT * FROM %I',
                regexp_replace(partition_record.tablename, '_\d{4}_\d{2}_\d{2}$', ''),
                partition_record.tablename
            );
            
            -- drop the partition
            EXECUTE format('DROP TABLE %I', partition_record.tablename);
            RAISE NOTICE 'Dropped partition: % (archived)', partition_record.tablename;
        END IF;
    END LOOP;
END;
$$ LANGUAGE plpgsql;

-- Run manually (retain 90 days)
-- SELECT cleanup_old_log_partitions(90);

-- Run automatically every week (requires the pg_cron extension)
-- SELECT cron.schedule('cleanup-old-log-partitions', '0 3 * * 0', 'SELECT cleanup_old_log_partitions(90)');
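
The name-based date extraction used by the cleanup function can be mirrored in application code, for example to preview which partitions a cleanup run would archive and drop before actually executing it. A minimal Java sketch (class and method names are illustrative, not part of the starter):

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class PartitionRetention {
    // Matches the _YYYY_MM_DD suffix appended by the partition-creation function.
    private static final Pattern SUFFIX = Pattern.compile("(\\d{4}_\\d{2}_\\d{2})$");
    private static final DateTimeFormatter FMT = DateTimeFormatter.ofPattern("yyyy_MM_dd");

    // Extract the partition date from a table name; null if there is no date suffix.
    static LocalDate partitionDate(String tableName) {
        Matcher m = SUFFIX.matcher(tableName);
        return m.find() ? LocalDate.parse(m.group(1), FMT) : null;
    }

    // Partitions strictly older than (today - daysToKeep), i.e. the ones
    // cleanup_old_log_partitions(daysToKeep) would archive and drop.
    static List<String> droppable(List<String> tables, LocalDate today, int daysToKeep) {
        LocalDate cutoff = today.minusDays(daysToKeep);
        return tables.stream()
                .filter(t -> {
                    LocalDate d = partitionDate(t);
                    return d != null && d.isBefore(cutoff);
                })
                .collect(Collectors.toList());
    }
}
```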

📈 Performance Tuning Recommendations

1. Write Optimization

sql
-- Tune PostgreSQL settings (postgresql.conf)
shared_buffers = 256MB                  # larger shared buffer cache
work_mem = 16MB                         # more per-sort/hash working memory
maintenance_work_mem = 128MB            # more memory for maintenance operations
wal_buffers = 16MB                      # larger WAL buffers
checkpoint_completion_target = 0.9      # spread checkpoint I/O over a longer window
max_wal_size = 2GB                      # allow more WAL between checkpoints

-- Disable autovacuum on log tables (run VACUUM manually on a schedule)
ALTER TABLE beyond_soft_request_log SET (autovacuum_enabled = false);

2. Query Optimization

sql
-- Refresh planner statistics periodically
ANALYZE beyond_soft_login_log;
ANALYZE beyond_soft_request_log;
ANALYZE beyond_soft_data_audit_log;
ANALYZE beyond_soft_third_part_log;

-- Store large TEXT values out-of-line without compression (note: STORAGE EXTERNAL
-- disables compression to speed up access; keep the default EXTENDED to retain compression)
ALTER TABLE beyond_soft_request_log ALTER COLUMN response SET STORAGE EXTERNAL;
ALTER TABLE beyond_soft_data_audit_log ALTER COLUMN response SET STORAGE EXTERNAL;

3. Batch Write Optimization

java
// Batch inserts in the service layer (500 rows per batch)
@Transactional
public void saveBatch(List<T> logs) {
    int batchSize = 500;
    for (int i = 0; i < logs.size(); i += batchSize) {
        int end = Math.min(i + batchSize, logs.size());
        List<T> batch = logs.subList(i, end);
        mapper.insertBatch(batch);
    }
}

🔍 Monitoring and Alerting

1. Partition Monitoring

sql
-- List all partitions and their sizes
SELECT 
    schemaname,
    tablename,
    pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS size
FROM pg_tables
WHERE tablename ~ '^beyond_soft_.*_log'
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC;

-- Detect missing partitions (next 7 days)
SELECT 
    'beyond_soft_login_log_' || to_char(generate_series, 'YYYY_MM_DD') AS expected_partition
FROM generate_series(
    CURRENT_DATE,
    CURRENT_DATE + INTERVAL '6 days',
    INTERVAL '1 day'
) AS generate_series
WHERE NOT EXISTS (
    SELECT 1 FROM pg_tables 
    WHERE tablename = 'beyond_soft_login_log_' || to_char(generate_series, 'YYYY_MM_DD')
);

2. Write Performance Monitoring

sql
-- Cumulative write counts per log table (sample over time to derive write TPS)
SELECT 
    schemaname,
    tablename,
    n_tup_ins AS inserts,
    n_tup_upd AS updates,
    n_tup_del AS deletes
FROM pg_stat_user_tables
WHERE tablename ~ '^beyond_soft_.*_log'
ORDER BY n_tup_ins DESC;

🚀 Technology Choice: MyBatis-Flex

Rationale

  1. ✅ Already integrated in the project; no extra dependency
  2. ✅ Strong dynamic-query support for complex conditions
  3. ✅ Generally lower overhead than JPA
  4. ✅ Consistent with the existing code style

Key Changes

1. Dependency Adjustment

xml
<!-- pom.xml: make the MongoDB starter optional -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-mongodb</artifactId>
    <scope>provided</scope>  <!-- provided scope lets consuming projects opt in -->
</dependency>

<!-- PostgreSQL driver: already managed in the parent POM -->
<!-- Nothing extra to add; reuse the project's existing MyBatis-Flex setup -->

2. Create the Mapper Interfaces (4)

java
// LoginLogMapper.java
@Mapper
public interface LoginLogMapper extends BaseMapper<BeyondSoftLoginLog> {
}

// RequestLogMapper.java
@Mapper
public interface RequestLogMapper extends BaseMapper<BeyondSoftRequestLog> {
}

// DataAuditLogMapper.java
@Mapper
public interface DataAuditLogMapper extends BaseMapper<BeyondSoftDataAuditLog> {
}

// ThirdPartLogMapper.java (in the web-starter module)
@Mapper
public interface ThirdPartLogMapper extends BaseMapper<BeyondSoftThirdPartLog> {
}

3. Update the Service Implementation

java
// 将MongoTemplate替换为Mapper
// 异步写入改为批量写入
// 查询条件从MongoDB正则改为SQL LIKE
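
The MongoDB-regex-to-SQL-LIKE change can be sketched as a small helper (the class name is ours, not part of the starter). Because % and _ are LIKE wildcards, they must be escaped to match literally; backslash is PostgreSQL's default LIKE escape character:

```java
// Sketch: converting a MongoDB regex "contains" filter into a SQL LIKE pattern.
class LikePatterns {
    static String contains(String keyword) {
        String escaped = keyword
                .replace("\\", "\\\\")  // escape the escape character itself
                .replace("%", "\\%")    // literal percent sign
                .replace("_", "\\_");   // literal underscore
        return "%" + escaped + "%";
    }
}
```

With MyBatis-Flex the resulting pattern would feed a `like` condition in the query builder; the exact calls depend on how the mappers are wired up.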

📝 Implementation Steps

Phase 1: Database Preparation (1 day)

  1. ✅ Create the PostgreSQL database
  2. ✅ Run the DDL scripts (create the parent tables)
  3. ✅ Install the partition-creation function
  4. ✅ Create partitions for the next 3 months

Phase 2: Code Changes (3 days)

  1. ✅ Update the 4 log model classes (add the table-mapping annotations)
  2. ✅ Create the 4 Mapper interfaces
  3. ✅ Rewrite BeyondSoftLogServiceImpl
  4. ✅ Update the query Controllers
  5. ✅ Update the auto-configuration classes (conditional beans)

Phase 3: Testing and Validation (2 days)

  1. ✅ Unit tests (all 4 log types)
  2. ✅ Integration tests (batch-write performance)
  3. ✅ Load tests (10,000 rows/second)
  4. ✅ Query API tests

Phase 4: Gradual Rollout (1 week)

  1. ✅ Enable dual writes (to both MongoDB and PostgreSQL)
  2. ✅ Verify data consistency between the two stores
  3. ✅ Switch the read path over incrementally
  4. ✅ Cut over fully to PostgreSQL
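
The dual-write step above can be sketched as a thin wrapper over the two stores (names are illustrative; a real version would wrap the Mongo- and PostgreSQL-backed service beans):

```java
import java.util.function.Consumer;

// Phase 4 sketch: every log entry goes to both stores, while reads are
// served from a single, switchable source.
class DualWriteLogStore<T> {
    private final Consumer<T> mongoWriter;
    private final Consumer<T> postgresWriter;
    private volatile boolean readFromPostgres = false;

    DualWriteLogStore(Consumer<T> mongoWriter, Consumer<T> postgresWriter) {
        this.mongoWriter = mongoWriter;
        this.postgresWriter = postgresWriter;
    }

    // A PostgreSQL failure must not break the still-authoritative
    // MongoDB path during the rollout.
    void write(T entry) {
        mongoWriter.accept(entry);
        try {
            postgresWriter.accept(entry);
        } catch (RuntimeException e) {
            // log and continue; the consistency check catches the gap later
        }
    }

    void switchReadsToPostgres() { readFromPostgres = true; }
    boolean readsFromPostgres() { return readFromPostgres; }
}
```

Once the consistency checks pass, `switchReadsToPostgres()` corresponds to step 3, and removing the Mongo writer entirely is the full cutover in step 4.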

Phase 5: Cleanup and Tuning (1 day)

  1. ✅ Remove the MongoDB dependency
  2. ✅ Delete the MongoDB configuration
  3. ✅ Tune performance
  4. ✅ Configure monitoring and alerting

⚠️ Risk Assessment

| Risk | Impact | Mitigation |
| --- | --- | --- |
| Write performance degradation | 🟡 Medium | Asynchronous batched writes + PostgreSQL parameter tuning |
| Query performance degradation | 🟢 Low | Partitioned tables + targeted indexes + SQL tuning |
| Partition management complexity | 🟡 Medium | Automation scripts + periodic checks |
| Very large log payloads | 🟢 Low | TEXT columns have no practical size limit + TOAST compression |
| Transaction overhead | 🟢 Low | Asynchronous writes + batched commits |
| Data migration | 🔴 High | Historical data is not migrated; only new data is written |

🎯 Expected Benefits

  1. Lower cost: fewer MongoDB servers to maintain
  2. Simpler architecture: PostgreSQL as the single store, a smaller technology stack
  3. Richer queries: complex SQL queries and aggregate analysis
  4. Scalability: partitioned tables scale to very large data volumes
  5. Reliability: PostgreSQL transactions keep log data consistent
