#ceph IRC Log

IRC Log for 2010-09-03

Timestamps are in GMT/BST.

[0:02] * Osso (osso@AMontsouris-755-1-6-62.w86-212.abo.wanadoo.fr) Quit (Quit: Osso)
[0:05] * Osso (osso@AMontsouris-755-1-6-62.w86-212.abo.wanadoo.fr) has joined #ceph
[0:05] * Osso_ (osso@AMontsouris-755-1-6-62.w86-212.abo.wanadoo.fr) has joined #ceph
[0:05] * Osso (osso@AMontsouris-755-1-6-62.w86-212.abo.wanadoo.fr) Quit (Read error: Connection reset by peer)
[0:05] * Osso_ is now known as Osso
[0:16] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) Quit (Quit: Yoric)
[1:01] * sagelap (~sage@c-24-218-65-120.hsd1.ma.comcast.net) has joined #ceph
[1:15] * sagelap (~sage@c-24-218-65-120.hsd1.ma.comcast.net) Quit (Quit: Leaving.)
[1:15] * sagelap (~sage@c-24-218-65-120.hsd1.ma.comcast.net) has joined #ceph
[1:27] * sagelap (~sage@c-24-218-65-120.hsd1.ma.comcast.net) Quit (Ping timeout: 480 seconds)
[2:09] * ezgreg (~Greg@78.155.152.6) Quit (Read error: Connection reset by peer)
[2:09] * ezgreg (~Greg@78.155.152.6) has joined #ceph
[2:39] * bbigras (quasselcor@bas11-montreal02-1128531099.dsl.bell.ca) has joined #ceph
[2:39] * bbigras is now known as Guest469
[3:09] * kblin_ (~kai@h1467546.stratoserver.net) has joined #ceph
[3:10] * kblin (~kai@h1467546.stratoserver.net) Quit (Read error: Connection reset by peer)
[3:38] * MK_FG (~fraggod@188.226.51.71) Quit (Remote host closed the connection)
[4:23] * Osso (osso@AMontsouris-755-1-6-62.w86-212.abo.wanadoo.fr) Quit (Quit: Osso)
[5:07] * MK_FG (~fraggod@wall.mplik.ru) has joined #ceph
[5:23] * lidongyang (~lidongyan@222.126.194.154) Quit (Read error: Connection reset by peer)
[5:27] * lidongyang (~lidongyan@222.126.194.154) has joined #ceph
[6:50] * f4m8_ is now known as f4m8
[8:18] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) has joined #ceph
[8:40] * f4m8 (~drehmomen@lug-owl.de) Quit (Quit: leaving)
[8:40] * f4m8 (~f4m8@lug-owl.de) has joined #ceph
[9:14] * allsystemsarego (~allsystem@188.25.128.208) has joined #ceph
[9:19] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) Quit (Quit: Yoric)
[10:02] * Yoric (~David@213.144.210.93) has joined #ceph
[10:05] * sentinel_ (~sentinel_@188.226.51.71) has joined #ceph
[10:06] * sentinel_ (~sentinel_@188.226.51.71) Quit (Remote host closed the connection)
[10:07] * MK_FG (~fraggod@wall.mplik.ru) has left #ceph
[10:28] * sentinel_x73 (~sentinel_@188.226.51.71) has joined #ceph
[10:29] * sentinel_x73 (~sentinel_@188.226.51.71) Quit (Remote host closed the connection)
[10:35] * sentinel_x73 (~sentinel_@188.226.51.71) has joined #ceph
[10:40] * sentinel_x73 (~sentinel_@188.226.51.71) Quit (Remote host closed the connection)
[10:40] * sentinel_x73 (~sentinel_@188.226.51.71) has joined #ceph
[10:41] * sentinel_x73 (~sentinel_@188.226.51.71) Quit (Remote host closed the connection)
[10:56] * sentinel_x73 (~sentinel_@188.226.51.71) has joined #ceph
[10:57] * sentinel_x73 (~sentinel_@188.226.51.71) Quit (Remote host closed the connection)
[11:02] * Osso (osso@AMontsouris-755-1-6-62.w86-212.abo.wanadoo.fr) has joined #ceph
[11:14] * sentinel_x73 (~sentinel_@188.226.51.71) has joined #ceph
[11:16] * MK_FG (~fraggod@wall.mplik.ru) has joined #ceph
[12:06] * f4m8 (~f4m8@lug-owl.de) Quit (Quit: Lost terminal)
[12:06] * f4m8 (~f4m8@lug-owl.de) has joined #ceph
[12:44] * MK_FG (~fraggod@wall.mplik.ru) Quit (Remote host closed the connection)
[13:32] * nolan (~nolan@phong.sigbus.net) Quit (synthon.oftc.net resistance.oftc.net)
[13:32] * conner (~conner@leo.tuc.noao.edu) Quit (synthon.oftc.net resistance.oftc.net)
[13:33] * nolan (~nolan@phong.sigbus.net) has joined #ceph
[13:33] * conner (~conner@leo.tuc.noao.edu) has joined #ceph
[14:07] * Osso (osso@AMontsouris-755-1-6-62.w86-212.abo.wanadoo.fr) Quit (Quit: Osso)
[14:14] * conner (~conner@leo.tuc.noao.edu) Quit (synthon.oftc.net resistance.oftc.net)
[14:14] * nolan (~nolan@phong.sigbus.net) Quit (synthon.oftc.net resistance.oftc.net)
[14:15] * nolan (~nolan@phong.sigbus.net) has joined #ceph
[14:15] * conner (~conner@leo.tuc.noao.edu) has joined #ceph
[14:48] * tjikkun (~tjikkun@2001:7b8:356:0:204:bff:fe80:8080) Quit (Ping timeout: 480 seconds)
[14:49] * tjikkun (~tjikkun@2001:7b8:356:0:204:bff:fe80:8080) has joined #ceph
[14:58] * MK_FG (~fraggod@188.226.51.71) has joined #ceph
[15:24] * conner (~conner@leo.tuc.noao.edu) Quit (reticulum.oftc.net resistance.oftc.net)
[15:24] * nolan (~nolan@phong.sigbus.net) Quit (reticulum.oftc.net resistance.oftc.net)
[15:25] * nolan (~nolan@phong.sigbus.net) has joined #ceph
[15:25] * conner (~conner@leo.tuc.noao.edu) has joined #ceph
[15:26] * f4m8 (~f4m8@lug-owl.de) Quit (Quit: Lost terminal)
[15:28] * f4m8 (~f4m8@lug-owl.de) has joined #ceph
[15:29] * sagelap (~sage@c-24-218-65-120.hsd1.ma.comcast.net) has joined #ceph
[15:30] * f4m8 (~f4m8@lug-owl.de) Quit ()
[15:31] * monrad is now known as monrad-51468
[15:31] * f4m8 (~f4m8@lug-owl.de) has joined #ceph
[15:40] * sagelap (~sage@c-24-218-65-120.hsd1.ma.comcast.net) Quit (Ping timeout: 480 seconds)
[16:32] * tjikkun (~tjikkun@2001:7b8:356:0:204:bff:fe80:8080) Quit (Ping timeout: 480 seconds)
[17:12] * sentinel_x73 (~sentinel_@188.226.51.71) Quit (Remote host closed the connection)
[17:14] * sentinel_x73 (~sentinel_@188.226.51.71) has joined #ceph
[17:28] * sentinel_x73 (~sentinel_@188.226.51.71) Quit (Remote host closed the connection)
[17:29] * sentinel_x73 (~sentinel_@188.226.51.71) has joined #ceph
[18:01] * tjikkun (~tjikkun@2001:7b8:356:0:204:bff:fe80:8080) has joined #ceph
[18:41] * Yoric (~David@213.144.210.93) Quit (Quit: Yoric)
[18:50] * conner (~conner@leo.tuc.noao.edu) Quit (Ping timeout: 480 seconds)
[18:58] * conner (~conner@leo.tuc.noao.edu) has joined #ceph
[20:17] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) has joined #ceph
[20:20] * klooie (~kloo@a82-92-246-211.adsl.xs4all.nl) has joined #ceph
[20:20] <klooie> hi.
[20:37] <wido> klooie: hi
[20:41] <wido> yehudasa: virsh snapshot-create works fine, does what it's intended to do. One downside: it snapshots all the attached disks; you can't control which disks to snapshot and which not.
[20:41] <yehudasa> wido: great
[20:41] <wido> and if, for example, you attach a disk to two VMs and run GFS/OCFS(2) inside the VMs, you can't snapshot those disks either
[20:42] <yehudasa> yeah, that's a problem
[20:42] <yehudasa> we plan to fix it sometime, it's just not such an easy task
[20:43] <wido> oh no, it won't be easy, but that's something for later :)
[20:51] <klooie> hi, people! :)
[20:52] <klooie> i still have the problem that when i write files > 4GB to ceph, they end up corrupt and 'ceph -w' shows no increase in the amount of data stored beyond the initial 4GB.
[20:53] <klooie> my initial ceph installation was on an ill-maintained gentoo image, so i installed debian from scratch to rule out any weirdness particular to that.
[20:53] <klooie> but the behaviour is the same.
[20:54] <klooie> i'm running on a 32-bit system, which i understood from the mailing list is not tested as thoroughly?
[20:55] <yehudasa> klooie: are you writing individual files that are greater than 4GB, or is the total size of all the files larger than 4GB?
[20:55] <klooie> individual files larger than 4GB.
[20:56] <yehudasa> are you able to write such large files with other filesystems on your client?
[20:56] <klooie> the files show on ceph as having the full size, but they're corrupt; checksum differs from the original.
[20:56] <yehudasa> I see
[20:56] <yehudasa> what version are you using?
[20:56] <klooie> yes i'm copying from an ext3 filesystem mounted on the same server as the ceph filesystem.
[20:57] <klooie> 0.21.1.
[20:57] <yehudasa> wido: were you able to use > 4GB files?
[20:57] <yehudasa> could be some 32 bit issue too, but probably not related to the one on the list today
[20:58] <klooie> i'm quite new to ceph overall - i've also been looking at the pg dump while copying such a file and beyond 4GB nothing changes anymore.
[20:58] <klooie> it's as if no new objects are being allocated?
[20:59] <yehudasa> sounds like a problem with offsets being truncated somewhere
[20:59] <klooie> in my naivety i would almost think a file offset is wrapping around.
[20:59] <klooie> aye. :)
[21:00] <klooie> i've started reading the code (after browsing some of the papers) but it's not accessible to me yet.
[21:00] <yehudasa> do you have a 64 bit client to verify?
[21:00] <klooie> unfortunately no.
[21:00] <klooie> this is my old home kit, but i may get one.
[21:01] <klooie> is there anything i might do to narrow it down?
[21:02] <yehudasa> can you turn on logging on both the client and the osds?
[21:02] <yehudasa> the osds should have 'debug ms = 1'
[21:03] <yehudasa> and do something like 'dd if=/dev/zero of=/mnt/foo bs=1024 count=1 seek=$((4*1024*1024))'
[21:03] <klooie> ok.
[21:04] <wido> yehudasa: i've written files over 4GB, but never verified them
[21:04] <wido> right now my Ceph is still down due to an MDS bug, so i'm only using RBD/RADOS at the moment
[21:04] <yehudasa> for client logs you'd need to have dynamic debug compiled in, and set something like "echo 'module ceph +p' > /sys/kernel/debug/dynamic_debug/control"
[21:04] <yehudasa> oh
[21:04] <klooie> they write at normal speed beyond 4GB too, so the problem isn't apparent.
[21:05] <klooie> hmm, though that's probably because i'm using iscsi and the network bandwidth is the limiting factor.
[21:05] <yehudasa> you can look at the write requests in the osd log and see the offsets written there
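
A minimal C equivalent of that dd test, as a sketch only: the mount point /mnt/ceph/foo is a placeholder, and nothing here is taken from the Ceph tree. It writes one 1 KiB block at the 4 GiB mark and fsyncs it, so that with 'debug ms = 1' the offset that actually reaches the osds shows up in the osd log.

    /* Sketch: write 1 KiB at offset 4 GiB on the mounted Ceph fs, then fsync,
     * so the osd log (with 'debug ms = 1') shows which offset really arrived.
     * _FILE_OFFSET_BITS=64 keeps userspace itself from being the 32-bit limiter.
     * The path /mnt/ceph/foo is only an illustrative placeholder. */
    #define _FILE_OFFSET_BITS 64
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/types.h>
    #include <unistd.h>

    int main(void)
    {
        char buf[1024];
        memset(buf, 0, sizeof(buf));

        int fd = open("/mnt/ceph/foo", O_WRONLY | O_CREAT, 0644);
        if (fd < 0) { perror("open"); return 1; }

        off_t target = (off_t)4 * 1024 * 1024 * 1024;   /* 4 GiB */
        ssize_t n = pwrite(fd, buf, sizeof(buf), target);
        if (n < 0) perror("pwrite");
        else printf("wrote %zd bytes at offset %lld\n", n, (long long)target);

        if (fsync(fd) < 0) perror("fsync");   /* flush so the osds see the write */
        close(fd);
        return 0;
    }
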
[21:33] <klooie> it wasn't really clear to me (in my innocence) what was happening with the seek and single byte writes..
[21:33] <klooie> now i'm copying an actual big file and i see it doing full writes to different oids.
[21:33] <klooie> if it wraps around, that should show in a bit?
[21:33] <yehudasa> it's not a single-byte write, it's a 1024-byte write
[21:34] <klooie> sorry yes, i did do bs=1024 (and everything else).
[21:34] <yehudasa> but maybe it's better doing 'dd if=/dev/zero of=/mnt/foo bs=4096 count=1 seek=$((1024*1024))' if you're using rbd
[21:34] <yehudasa> so that it doesn't read the block first
[21:35] <klooie> but i seemed to get only ops from the mds; now i'm getting ops from a client that fill up one object after another.
[21:35] <yehudasa> oh.. you need to do sync after that I guess
[21:35] <yehudasa> and I take it that you're running the filesystem and not rbd
[21:35] <klooie> yes, indeed.
[21:39] <klooie> the file copy is slowly approaching the 4GB mark; then i'll go back to dd.
[21:43] <wido> yehudasa & klooie: Wouldn't downloading a Debian ISO make a good test?
[21:43] <wido> it's around 4.4GB
[21:46] <yehudasa> hmm.. I'd rather just see a single write beyond the 4GB mark..
[21:48] <klooie> beyond 4GB it goes back to the same oids that it wrote to at the start.
[21:52] <klooie> doing sync after each dd does the trick.
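
A small illustration of the wrap-around klooie describes, writes past 4GB landing back on the first oids. The 4 MiB object size is an assumed default and the arithmetic is only a sketch of the suspected 32-bit truncation, not Ceph's actual striping code.

    /* Sketch: with an assumed 4 MiB object size, an offset truncated to
     * 32 bits maps back to the very first object numbers of the file. */
    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        const uint64_t object_size = 4ULL * 1024 * 1024;     /* assumed 4 MiB layout */
        uint64_t offsets[] = { 4096, (4ULL << 30) + 4096 };  /* 4 KiB and 4 GiB + 4 KiB */

        for (int i = 0; i < 2; i++) {
            uint64_t full = offsets[i];
            uint64_t full_obj = full / object_size;
            uint32_t wrapped = (uint32_t)full;               /* 32-bit truncation */
            uint32_t wrapped_obj = (uint32_t)(wrapped / object_size);
            printf("offset %llu -> object %llu, but truncated offset %u -> object %u\n",
                   (unsigned long long)full, (unsigned long long)full_obj,
                   (unsigned)wrapped, (unsigned)wrapped_obj);
        }
        return 0;
    }
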
[21:53] <klooie> would you like to see, yehudasa? how/where can i best show you?
[22:04] <yehudasa> pastebin?
[22:13] <klooie> voila, http://pastebin.com/JQ8Cxez8
[22:15] * allsystemsarego (~allsystem@188.25.128.208) Quit (Quit: Leaving)
[22:16] <yehudasa> checking it out
[22:19] <yehudasa> yeah, I guess the problem is that size_t and off_t are probably defined as 32-bit
[22:24] <klooie> sizeof(off_t) == sizeof(size_t) == 4, in a quick dummy C program.
[22:24] <klooie> but ext3 isn't bothered, somehow?
[22:24] <yehudasa> yeah, verifying what we need to do
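
A minimal sketch of that "quick dummy C program" check, with the usual glibc build-flag angle added as an assumption about the environment (it is not something confirmed in this channel): on a 32-bit build, off_t is 4 bytes unless the code is compiled with -D_FILE_OFFSET_BITS=64, in which case it becomes 8 bytes and can hold offsets past 4 GiB; size_t stays 4 bytes either way.

    /* Sketch: print the sizes klooie checked, under both build modes. */
    #include <stdio.h>
    #include <sys/types.h>

    int main(void)
    {
        printf("sizeof(off_t)  = %zu\n", sizeof(off_t));
        printf("sizeof(size_t) = %zu\n", sizeof(size_t));
        /* try:  gcc -m32 check.c                          -> off_t = 4
         * then: gcc -m32 -D_FILE_OFFSET_BITS=64 check.c   -> off_t = 8 */
        return 0;
    }
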
[22:30] <yehudasa> can you get the client log too?
[22:32] <klooie> unfortunately, i have to go. :(
[22:32] <klooie> but i will gladly provide it later.
[22:33] <yehudasa> ok, I'll look at it, thanks
[22:33] <yehudasa> I'll also open a bug report
[22:33] <klooie> thank you, i think ceph's great.
[22:33] <yehudasa> thanks!
[22:34] <klooie> bbl!
[22:34] * klooie (~kloo@a82-92-246-211.adsl.xs4all.nl) Quit (Quit: ...)
[22:40] * Yoric (~David@dau94-10-88-189-211-192.fbx.proxad.net) Quit (Quit: Yoric)

These logs were automatically created by CephLogBot on irc.oftc.net using the Java IRC LogBot.