Created
May 2, 2011 09:28
| <thesjg!~sjgirc@67-54-133-153.cust.wildblue.net> [03:10] i think we should commit an optimization for that phoronix threaded i/o tester | |
| <thesjg!~sjgirc@67-54-133-153.cust.wildblue.net> [03:10] that just ignores about 80% of the i/o's | |
| <thesjg!~sjgirc@67-54-133-153.cust.wildblue.net> [03:10] :-) | |
| <dillon!~dillon@leaf.dragonflybsd.org> [03:11] we could certainly optimize buffers containing all-zeros (if we don't already). heh | |
| <thesjg!~sjgirc@67-54-133-153.cust.wildblue.net> [03:11] that might be useful | |
| <thesjg!~sjgirc@67-54-133-153.cust.wildblue.net> [03:11] they are probably all zero's | |
| <thesjg!~sjgirc@67-54-133-153.cust.wildblue.net> [03:11] i looked at what it did | |
| <thesjg!~sjgirc@67-54-133-153.cust.wildblue.net> [03:11] it just allocated a chunk ala malloc() and then wrote that allocated buffers contents to a file | |
| <thesjg!~sjgirc@67-54-133-153.cust.wildblue.net> [03:12] it never touched the buffer | |
| <thesjg!~sjgirc@67-54-133-153.cust.wildblue.net> [03:13] how do you know if its all zero's, in an efficient fashion? | |
| <thesjg!~sjgirc@67-54-133-153.cust.wildblue.net> [03:13] i guess you compute a crc for the data, you are touching it all anyway, you could just not actually store it | |
| <me_!U2FsdGVkX1@batman.acm.jhu.edu> [03:14] freebsd's pagezero() on i686 actually finds the first nonzero index in a page and starts zeroing from there :d | |
| <dillon!~dillon@leaf.dragonflybsd.org> [03:16] it would be easy for the crc32 code to have another argument which returns the zero/non-zero state of the buffer | |
| <dillon!~dillon@leaf.dragonflybsd.org> [03:16] well, I guess it would be faster just to scan the buffer twice anyway | |
| <thesjg!~sjgirc@67-54-133-153.cust.wildblue.net> [03:17] well, whats the crc32 of a zero-filled buffer? | |
| <thesjg!~sjgirc@67-54-133-153.cust.wildblue.net> [03:17] you could just conditionalize on it | |
| <dillon!~dillon@leaf.dragonflybsd.org> [03:18] you'd still have to check that the buffer contains all-zeros, but yes that could be a first-order approximation to avoid the zero-check if it doesn't match | |
| <thesjg!~sjgirc@67-54-133-153.cust.wildblue.net> [03:18] would it be hard to make hammer smart enough to not lay the zero's down on disk? | |
| <dillon!~dillon@leaf.dragonflybsd.org> [03:18] no, it would be trivial. | |
| <thesjg!~sjgirc@67-54-133-153.cust.wildblue.net> [03:18] yeah thats what i figured | |
| <me_!U2FsdGVkX1@batman.acm.jhu.edu> [03:18] stop encouraging each other ! :) | |
| <dillon!~dillon@leaf.dragonflybsd.org> [03:19] you'd just have a data record with a data_offset of 0 as a special case meaning 'buffer full of zeros' | |
| <dillon!~dillon@leaf.dragonflybsd.org> [03:19] it might already be coded or partially coded. it would be very easy to implement | |
| <thesjg!~sjgirc@67-54-133-153.cust.wildblue.net> [03:23] dillon: hammer_object.c somewhere? | |
| <dillon!~dillon@leaf.dragonflybsd.org> [03:24] mmmm. I'd have to look. hold on a sec | |
| <dillon!~dillon@leaf.dragonflybsd.org> [03:24] for writes, hammer_ip_add_bulk() | |
| <dillon!~dillon@leaf.dragonflybsd.org> [03:25] check for all zeros, and don't allocate a data offset (set record->leaf.data_offset to 0) . well, there might be a learning experience there, those code paths are complex but the actual mod isn't going to be very complex | |
| <thesjg!~sjgirc@67-54-133-153.cust.wildblue.net> [03:26] ok thats what i thought, i was looking at the dedup hooks | |
| <dillon!~dillon@leaf.dragonflybsd.org> [03:26] then on the read-back side checking for a data_offset of 0 and zero-filling the read buffer | |
| <dillon!~dillon@leaf.dragonflybsd.org> [03:26] instead of doing a direct data read | |
| <thesjg!~sjgirc@67-54-133-153.cust.wildblue.net> [03:26] read side just allocates and return()'s a buffer? | |
| <thesjg!~sjgirc@67-54-133-153.cust.wildblue.net> [03:26] or similar? | |
| <dillon!~dillon@leaf.dragonflybsd.org> [03:27] read side is accessing hammer via its b-tree and filling in buffer cache buffers for the related file | |
| <dillon!~dillon@leaf.dragonflybsd.org> [03:27] so in the all-zeros case it would check that the b-tree record has no data offset (implied all-zeros) and bzero()'s the buffer cache buffer | |
| <dillon!~dillon@leaf.dragonflybsd.org> [03:28] I think you could do it fairly easily inside a vkernel for testing | |
| <thesjg!~sjgirc@67-54-133-153.cust.wildblue.net> [03:28] dillon: yeah i'll try that | |
| <dillon!~dillon@leaf.dragonflybsd.org> [03:28] conditionalize the code on hammer filesystem version 6 (the current WIP version) so it doesn't execute on your root filesystem | |
| <dillon!~dillon@leaf.dragonflybsd.org> [03:28] then use a small hammer partition formatted w/ hammer and upgraded to version 6 for testing. or inside a vkernel | |
| <dillon!~dillon@leaf.dragonflybsd.org> [03:31] thesjg: another thing that can be done is to check for the append-all-zeros case. In that case no new record needs to be written at all, the file size is simply adjusted and it is a hole | |
| <thesjg!~sjgirc@67-54-133-153.cust.wildblue.net> [03:31] dillon: ahh, yeah, good call, i'll note that too |
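
A rough sketch of the idea discussed above, not actual HAMMER code: the write side scans the buffer and, if it is all zeros, stores no data and leaves `data_offset` at 0; the read side treats a `data_offset` of 0 as "buffer full of zeros" and zero-fills instead of reading. Only `data_offset` and the `hammer_ip_add_bulk()` / `bzero()` call sites come from the log. `struct toy_record`, `buf_is_all_zeros()`, `toy_write()`, `toy_read()` and the flat `disk` array are hypothetical names used purely for illustration, and the real code paths are far more involved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a data record; only data_offset is named in the log. */
struct toy_record {
    uint64_t data_offset;   /* 0 is reserved to mean "buffer full of zeros" */
    size_t   data_len;
};

/* Plain linear scan; returns 1 if every byte is zero, 0 otherwise. */
int buf_is_all_zeros(const void *buf, size_t len)
{
    const unsigned char *p = buf;
    size_t i;

    for (i = 0; i < len; i++) {
        if (p[i] != 0)
            return 0;
    }
    return 1;
}

/*
 * Write side (assumed shape): skip the data allocation for all-zero buffers.
 * Returns 1 if data was written to "disk", 0 if it was elided.
 */
int toy_write(struct toy_record *rec, unsigned char *disk,
              const void *buf, size_t len, uint64_t next_free_offset)
{
    /* Offset 0 is the special case, so real data must never land there. */
    assert(next_free_offset != 0);

    rec->data_len = len;
    if (buf_is_all_zeros(buf, len)) {
        rec->data_offset = 0;           /* no data offset allocated */
        return 0;
    }
    rec->data_offset = next_free_offset;
    memcpy(disk + next_free_offset, buf, len);
    return 1;
}

/* Read side (assumed shape): zero-fill instead of doing a data read. */
void toy_read(const struct toy_record *rec, const unsigned char *disk,
              void *out)
{
    if (rec->data_offset == 0)
        memset(out, 0, rec->data_len);  /* the bzero() case from the log */
    else
        memcpy(out, disk + rec->data_offset, rec->data_len);
}
```

The append-all-zeros case mentioned at the end of the log would go one step further and write no record at all, only adjusting the file size; that is not shown here.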