postgresでserial型のキーが溢れてしまった

2015-02-14 / PostgreSQL, DataBase

稼働から1年くらいで、serial型にしていたキーの値が上限に達してしまった。

データの増加量の見積もりが甘かったのが原因で、キーの型をbigintにして解決した。

再現

PostgreSQLのserial型はテーブル作成時に同時にsequenceを作ってくれる。

test=# create table sample(
test(#   id serial,
test(#   number integer,
test(#   created_at time default current_timestamp
test(# );
test=# \d
                リレーションの一覧
 スキーマ |     名前      |     型     |  所有者  
----------+---------------+------------+----------
 public   | sample        | テーブル   | postgres
 public   | sample_id_seq | シーケンス | postgres
(2 行)

sequenceを利用するテーブル(sample)では、実際のデータ型はintになる。一方、sequence(sample_id_seq)の型はbigintになっている。

test=# \d sample*
                                 テーブル "public.sample"
     列     |           型           |                       修飾語                        
------------+------------------------+-----------------------------------------------------
 id         | integer                | not null default nextval('sample_id_seq'::regclass)
 number     | integer                | 
 created_at | time without time zone | default now()

       シーケンス "public.sample_id_seq"
      列       |   型    |         値          
---------------+---------+---------------------
 sequence_name | name    | sample_id_seq
 last_value    | bigint  | 1
 start_value   | bigint  | 1
 increment_by  | bigint  | 1
 max_value     | bigint  | 9223372036854775807
 min_value     | bigint  | 1
 cache_value   | bigint  | 1
 log_cnt       | bigint  | 0
 is_cycled     | boolean | f
 is_called     | boolean | f
所有者: public.sample.id

適当にインサートする。

test=# insert into sample(number) values(100);
INSERT 0 1
test=# insert into sample(number) values(200);
INSERT 0 1
test=# insert into sample(number) values(300);
INSERT 0 1
test=# select * from sample;
 id | number |   created_at    
----+--------+-----------------
  1 |    100 | 05:58:37.695097
  2 |    200 | 05:58:41.566133
  3 |    300 | 05:58:44.077084
(3 行)

test=# \d sample_id_seq 
       シーケンス "public.sample_id_seq"
      列       |   型    |         値          
---------------+---------+---------------------
 sequence_name | name    | sample_id_seq
 last_value    | bigint  | 3
 start_value   | bigint  | 1
 increment_by  | bigint  | 1
 max_value     | bigint  | 9223372036854775807
 min_value     | bigint  | 1
 cache_value   | bigint  | 1
 log_cnt       | bigint  | 30
 is_cycled     | boolean | f
 is_called     | boolean | t
所有者: public.sample.id

last_valueが3になっている。

とりあえずintの上限値近くまで現在値を増やしてみる。

test=# select setval('sample_id_seq', 2147483645, false);
   setval   
------------
 2147483645
(1 行)

test=# \d sample_id_seq 
       シーケンス "public.sample_id_seq"
      列       |   型    |         値          
---------------+---------+---------------------
 sequence_name | name    | sample_id_seq
 last_value    | bigint  | 2147483645
 start_value   | bigint  | 1
 increment_by  | bigint  | 1
 max_value     | bigint  | 9223372036854775807
 min_value     | bigint  | 1
 cache_value   | bigint  | 1
 log_cnt       | bigint  | 0
 is_cycled     | boolean | f
 is_called     | boolean | f
所有者: public.sample.id

last_valueが2147483645になった。あと数回でintの上限に達する。

もう一度、適当にインサートする。

test=# insert into sample(number) values(400);
INSERT 0 1
test=# insert into sample(number) values(500);
INSERT 0 1
test=# insert into sample(number) values(600);
INSERT 0 1
test=# insert into sample(number) values(700);
ERROR:  integerの範囲外です
test=# select * from sample;
     id     | number |   created_at    
------------+--------+-----------------
          1 |    100 | 05:58:37.695097
          2 |    200 | 05:58:41.566133
          3 |    300 | 05:58:44.077084
 2147483645 |    400 | 06:06:09.040102
 2147483646 |    500 | 06:06:13.327061
 2147483647 |    600 | 06:06:17.888157
(6 行)

sampleのidにintの上限値を超えた値(2147483648)を入れようとしたので、残念な結果になった。

sample_id_seqは、sampleのインサートが成功したかどうかに関係なく、値は増加していく。あくまでもbigintなので、使う方の都合なんて関係なく、intの上限を超えてくる。

test=# insert into sample(number) values(800);
ERROR:  integerの範囲外です
test=# insert into sample(number) values(900);
ERROR:  integerの範囲外です
test=# insert into sample(number) values(1000);
ERROR:  integerの範囲外です
test=# \d sample_id_seq 
       シーケンス "public.sample_id_seq"
      列       |   型    |         値          
---------------+---------+---------------------
 sequence_name | name    | sample_id_seq
 last_value    | bigint  | 2147483651
 start_value   | bigint  | 1
 increment_by  | bigint  | 1
 max_value     | bigint  | 9223372036854775807
 min_value     | bigint  | 1
 cache_value   | bigint  | 1
 log_cnt       | bigint  | 30
 is_cycled     | boolean | f
 is_called     | boolean | t
所有者: public.sample.id

なので、sampleの方でidをbigintに変更するしかない。

test=# alter table sample alter id type bigint;
ALTER TABLE
test=# \d sample
                                 テーブル "public.sample"
     列     |           型           |                       修飾語                        
------------+------------------------+-----------------------------------------------------
 id         | bigint                 | not null default nextval('sample_id_seq'::regclass)
 number     | integer                | 
 created_at | time without time zone | default now()

ちゃんとカラムがbigintになっている。

もう一度インサートしてみる。

test=# insert into sample(number) values(1100);
INSERT 0 1
test=# insert into sample(number) values(1200);
INSERT 0 1
test=# insert into sample(number) values(1300);
INSERT 0 1
test=# select * from sample;
     id     | number |   created_at    
------------+--------+-----------------
          1 |    100 | 05:58:37.695097
          2 |    200 | 05:58:41.566133
          3 |    300 | 05:58:44.077084
 2147483645 |    400 | 06:06:09.040102
 2147483646 |    500 | 06:06:13.327061
 2147483647 |    600 | 06:06:17.888157
 2147483652 |   1100 | 06:17:54.635158
 2147483653 |   1200 | 06:17:57.313085
 2147483654 |   1300 | 06:17:59.722082
(9 行)

インサートできた。

作成時

初めからbigintにしたければ、bigserialを使う。

test=# create table sample2(
test(#   id bigserial,
test(#   number integer,
test(#   created_at time default current_timestamp
test(# );
CREATE TABLE
est=# \d sample2*
                                 テーブル "public.sample2"
     列     |           型           |                        修飾語                        
------------+------------------------+------------------------------------------------------
 id         | bigint                 | not null default nextval('sample2_id_seq'::regclass)
 number     | integer                | 
 created_at | time without time zone | default now()

      シーケンス "public.sample2_id_seq"
      列       |   型    |         値          
---------------+---------+---------------------
 sequence_name | name    | sample2_id_seq
 last_value    | bigint  | 1
 start_value   | bigint  | 1
 increment_by  | bigint  | 1
 max_value     | bigint  | 9223372036854775807
 min_value     | bigint  | 1
 cache_value   | bigint  | 1
 log_cnt       | bigint  | 0
 is_cycled     | boolean | f
 is_called     | boolean | f
所有者: public.sample2.id

あと何年運用できる？

bigintに変更したのは良いんだけど、また上限に達したらどうするのか？

約1年の運用でintの上限値に達したので、同じペースでデータが増え続けると、

9223372036854775807 / 2147483647 = 4294967298

ということで、4294967298年運用できる。

仮に今年から100万倍のデータ量になっても、

9223372036854775807 / 2147483647000000 = 4294.967298

4294年運用できるらしい。

上限に達する前に、システムリプレースしてそうだし、現実的には大丈夫そうだ。