Monday, April 6, 2009

Performance Problems

I am having MySQL performance problems with my current schema. Here is a rundown of what the inserter does:

- get's the filename, filesize and hash for a file
- checks the objects database to see if that combination already exists
- if it exists, skip to next step
- if it does not exisit, insert it
- insert the information in the file table

Here is the create table definition for the tables:

CREATE TABLE `objects` (
`objectid` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`filename` varchar(256) COLLATE latin1_bin DEFAULT NULL,
`filesize` bigint(20) unsigned DEFAULT NULL,
`hash` varchar(32) COLLATE latin1_bin DEFAULT NULL,
PRIMARY KEY (`objectid`),
UNIQUE KEY `nsh` (`filename`,`filesize`,`hash`)
) ENGINE=MyISAM AUTO_INCREMENT=7037849 DEFAULT CHARSET=latin1 COLLATE=latin1_bin

CREATE TABLE `files` (
`path` varchar(4096) COLLATE latin1_bin DEFAULT NULL,
`filename` varchar(256) COLLATE latin1_bin DEFAULT NULL,
`filesize` bigint(20) unsigned DEFAULT NULL,
`hash` varchar(32) COLLATE latin1_bin DEFAULT NULL,
`backuptime` datetime DEFAULT NULL,
`status` enum('Active','Inactive','Deleted') COLLATE latin1_bin DEFAULT NULL,
`objectid` bigint(20) unsigned DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1 COLLATE=latin1_bin

I currently have a files table per backup client node. On my test system, I can run the inserts sequentially for 15 nodes (37 or so processes for multiple filesystem clients), and it runs in about 50 minutes. When i run them at the same time, it runs many times longer. It looks like it is the objects table, since I tried "insert delayed" and that cut the time back down to around the sequential time. The downside to the delayed, is that a crash can lose data.

Does anyone have any ideas as to what I am doing wrong? If it matters, I am running a dual core AMD 5800+, 4GB memory, and a 4 disk raid-0 SSD Array.

Thanks for any input anyone may have.

No comments: