Issue #257: Missing DB-Indexes (conservancy/kallithea)

Thomas Güttler issues-reply at bitbucket.org
Tue Dec 6 09:51:28 UTC 2016


New issue 257: Missing DB-Indexes
https://bitbucket.org/conservancy/kallithea/issues/257/missing-db-indexes

Thomas Güttler:

I used this SQL query to check if there are missing indexes:


```
#!sql

kallithea=> SELECT relname, seq_scan-idx_scan AS too_much_seq, case when seq_scan-idx_scan>0 THEN 'Missing Index?' ELSE 'OK' END, pg_relation_size(relname::regclass) AS rel_size, seq_scan, idx_scan
kallithea->  FROM pg_stat_all_tables
kallithea->  WHERE schemaname='public' AND pg_relation_size(relname::regclass)>80000 ORDER BY too_much_seq DESC;

```
Result


```
#!sql


   relname    | too_much_seq |      case      | rel_size | seq_scan | idx_scan 
--------------+--------------+----------------+----------+----------+----------
 user_logs    |          638 | Missing Index? | 98467840 |      638 |        0
 repositories |      -864284 | OK             |    90112 |    84041 |   948325

```

There are a lot of rows in this table:

```
#!sql

kallithea=> select count(*) from user_logs;
  count  
---------
 1051155

```
I guess this is not the root of our performance issue, but maybe this could be optimized nevertheless.

This is the table schema:

```
#!sql

kallithea=> \d user_logs
                                             Tabelle »public.user_logs«
     Spalte      |             Typ             |                              Attribute                              
-----------------+-----------------------------+---------------------------------------------------------------------
 user_log_id     | integer                     | not null Vorgabewert nextval('user_logs_user_log_id_seq'::regclass)
 user_id         | integer                     | 
 username        | character varying(255)      | 
 repository_id   | integer                     | 
 repository_name | character varying(255)      | 
 user_ip         | character varying(255)      | 
 action          | text                        | 
 action_date     | timestamp without time zone | 
Indexe:
    "user_logs_pkey" PRIMARY KEY, btree (user_log_id)
Fremdschlüssel-Constraints:
    "user_logs_repository_id_fkey" FOREIGN KEY (repository_id) REFERENCES repositories(repo_id)
    "user_logs_user_id_fkey" FOREIGN KEY (user_id) REFERENCES users(user_id)

```

I see two solutions:

1. cut this table from time to time

2. use an index for this table.

Unfortunately I don't know which column (or combined columns) would need an index. 

The source of this PostgreSQL Index Usage Analysis is here: http://stackoverflow.com/questions/3318727/postgresql-index-usage-analysis




More information about the kallithea-general mailing list