PostgreSQL: Optimizing OLTP Performance Under Highly Concurrent Requests

Date: 2020-05-16 | Category: PostgreSQL | Author: 编程之家

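On a multi-core system, TPS generally rises as the number of concurrent connections increases. Once concurrency passes a certain point (for example, 2 to 3 times the number of CPU cores), performance starts to fall, and the higher the concurrency, the steeper the drop.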

Example:

Update one random row in a table of 5 million rows, using 8000 concurrent connections.


  
  
   
    create table test_8000 (id int primary key, cnt int default 0);
    insert into test_8000 select generate_series(1,5000000);

    vi t.sql

    \setrandom id 1 5000000
    update test_8000 set cnt=cnt+1 where id=:id;

Start 80 connections at a time and repeat 100 times, for a total of 8000 concurrent connections.

    vi test.sh

    #!/bin/bash
    # "sleep 1" and "-j 80" are assumed values; the source listing does not show them
    for ((i=0;i<100;i++))
    do
      sleep 1
      pgbench -M simple -n -r -f ./t.sql -c 80 -j 80 -T 100000 -U postgres &
    done

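Start the test:

    ./test.sh

Once the number of connections reaches 8000, observe the TPS. It can be computed from PostgreSQL's statistics views: sample the table's update counter twice and divide the difference by the elapsed time.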

    postgres=# select count(*) from pg_stat_activity;
     count
    -------
      8002
    (1 row)

    postgres=# select timestamptz '2015-10-08 17:01:24.203089+08' - timestamptz '2015-10-08 17:01:16.574076+08';
        ?column?
    -----------------
     00:00:07.629013
    (1 row)

    postgres=# select 43819090-43749480;
     ?column?
    ----------
        69610
    (1 row)

    postgres=# select 69610/07.629013;
           ?column?
    -----------------------
     9124.378...
    (1 row)

With 8000 concurrent connections, the update rate is about 9124 TPS. Most of the time is probably being wasted on CPU scheduling.
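The arithmetic behind that figure can be sketched in a few lines of Python. This is only an illustration: `tps_from_samples` is a hypothetical helper name, and the counter values and timestamps are the ones from the psql session above.

```python
from datetime import datetime


def tps_from_samples(count_before, count_after, time_before, time_after):
    """Return updates per second between two counter samples."""
    elapsed = (time_after - time_before).total_seconds()
    if elapsed <= 0:
        raise ValueError("time_after must be later than time_before")
    return (count_after - count_before) / elapsed


# Counter values and timestamps copied from the psql session in this article.
t1 = datetime.fromisoformat("2015-10-08 17:01:16.574076+08:00")
t2 = datetime.fromisoformat("2015-10-08 17:01:24.203089+08:00")
tps = tps_from_samples(43749480, 43819090, t1, t2)
print(f"{tps:.0f} updates per second")
```

The longer the sampling window, the less the result is affected by short bursts, so a window of at least several seconds is a reasonable choice.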

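Another scenario: suppose the 8000 connections are idle and only 10 connections are running updates. Performance then looks like this.

First, create 8000 idle connections. Each one simply runs a long sleep, started from the same kind of launcher loop as above:

    select pg_sleep(100000);

Then open 10 connections to run the update:

    pgbench -M prepared -n -P 1 -f ./t.sql -c 10 -T 1000 -U postgres postgres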

    progress: 1.0 s, 29429.2 tps, lat 0.336 ms stddev 0.109
    ...

In this scenario, performance is about 60,000 QPS.

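Optimization approach:

Queue the user requests. As with pgbouncer or Oracle's shared server mechanism, only a limited number of processes actually handle requests.

PostgreSQL's advisory lock functions can be used to simulate this queueing mechanism: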

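    create or replace function upd(l int, v_id int) returns void as $$
    declare
    begin
      LOOP
        -- Run the update only after acquiring this application-level lock; otherwise wait.
        if pg_try_advisory_xact_lock(l) then
          update test_8000 set cnt=cnt+1 where id=v_id;
          return;
        else
          perform pg_sleep(30*random());  -- wait for a random time
        end if;
      END LOOP;
    end;
    $$ language plpgsql strict;

A random variable l is added to represent the advisory lock number. In other words, this simulates 10 updates running at the same time while all the other requests wait.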

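This is an unoptimized queueing mechanism: requests are not handled by a separate set of worker processes. Backend processes still handle the user requests, and there are still 8000 of them.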
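The try-lock-or-back-off pattern used by the function can be sketched outside the database as well. The following Python simulation is only an analogy, not the author's implementation: threads stand in for backend processes, `threading.Lock` objects stand in for the 10 advisory lock numbers, and `run_with_slots`, the worker count, and the sleep durations are all hypothetical choices made for illustration.

```python
import random
import threading
import time


def run_with_slots(num_workers=50, num_slots=10, seed=0):
    """Run num_workers tasks while at most num_slots do real work at once.

    Returns (completed, peak_active).
    """
    slots = [threading.Lock() for _ in range(num_slots)]  # like lock numbers 1..10
    stats_guard = threading.Lock()
    stats = {"active": 0, "peak": 0, "completed": 0}

    def worker(rng):
        while True:
            slot = slots[rng.randrange(num_slots)]  # like picking a random l
            if slot.acquire(blocking=False):        # like pg_try_advisory_xact_lock(l)
                try:
                    with stats_guard:
                        stats["active"] += 1
                        stats["peak"] = max(stats["peak"], stats["active"])
                    time.sleep(0.001)               # stand-in for the UPDATE
                    with stats_guard:
                        stats["active"] -= 1
                        stats["completed"] += 1
                finally:
                    slot.release()                  # the xact lock is released at transaction end
                return
            time.sleep(rng.random() * 0.005)        # random back-off before retrying

    threads = [
        threading.Thread(target=worker, args=(random.Random(seed + i),))
        for i in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return stats["completed"], stats["peak"]


completed, peak = run_with_slots()
print(completed, peak)
```

Every worker eventually completes, and the number doing work at any moment never exceeds the number of slots. As in the PL/pgSQL version, the waiting workers still exist and still wake up to retry, which is why this is a simulation of queueing rather than a real connection pool.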

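The pgbench script picks a random row id and a random lock number between 1 and 10:

    vi t.sql

    \setrandom id 1 5000000
    \setrandom l 1 10
    select upd(:l, :id);

It is started from test.sh with the same kind of 100-iteration launcher loop as before, so there are still 8000 connections.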

The test results are quite good: performance has more than doubled.

    postgres=# select now(),n_tup_upd+n_tup_hot_upd from pg_stat_all_tables where relname='test_8000';
                  now              | ?column?
    -------------------------------+-----------
     2015-10-08 19:06:37.951332+08 | 221045069
    (1 row)

    postgres=# select now(),n_tup_upd+n_tup_hot_upd from pg_stat_all_tables where relname='test_8000';
                 now              | ?column?
    ------------------------------+-----------
     2015-10-08 19:07:46.46325+08 | 222879057
    (1 row)

    postgres=# select timestamptz '2015-10-08 19:07:46.46325+08' - timestamptz '2015-10-08 19:06:37.951332+08';
        ?column?
    -----------------
     00:01:08.511918
    (1 row)

    postgres=# select 222879057-221045069;
     ?column?
    ----------
      1833988
    (1 row)

    postgres=# select 1833988/68.5;
          ?column?
    --------------------
     26773.547...
    (1 row)

In this simulation, queueing delivers more than double the performance of the unqueued case.
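As a rough cross-check, the two measurements can be compared directly. The sketch below uses a hypothetical helper `rate` and the counter deltas and intervals reported in this article, with the exact 68.511918-second interval rather than the rounded 68.5. One caveat: in PostgreSQL, `n_tup_upd` already includes HOT updates, so `n_tup_upd+n_tup_hot_upd` counts those twice. The absolute figures may therefore be overstated, but if both runs were sampled with the same expression, the ratio between them is unaffected.

```python
def rate(delta_count, elapsed_seconds):
    """Return the average number of counted updates per second."""
    return delta_count / elapsed_seconds


# Unqueued run: 8000 connections all updating directly.
unqueued = rate(43819090 - 43749480, 7.629013)
# Queued run: 8000 connections competing for 10 advisory locks.
queued = rate(222879057 - 221045069, 68.511918)

speedup = queued / unqueued
print(f"unqueued {unqueued:.0f}/s, queued {queued:.0f}/s, speedup {speedup:.2f}x")
```

On these numbers the queued run is a little under three times as fast as the unqueued run, which agrees with the statement that performance more than doubled.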

top output:

    top - ...09:37 up 119 days, ... users, load average: 0.96, 0.98, 1.01
    Tasks: 8872 total, 5 running, 8866 sleeping, ... stopped, 0 zombie