Displaying #cassandra-dev/2016-10-04.log:

Tue Oct 4 00:15:55 2016  mstepura:Joined the channel
Tue Oct 4 00:19:57 2016  stef1927:thanks jeffj
Tue Oct 4 01:04:11 2016  sood:Joined the channel
Tue Oct 4 01:19:19 2016  kohlisankalp:Joined the channel
Tue Oct 4 01:35:17 2016  sood:Joined the channel
Tue Oct 4 02:12:44 2016  sood:Joined the channel
Tue Oct 4 02:14:48 2016  mstepura:Joined the channel
Tue Oct 4 02:27:15 2016  sood:Joined the channel
Tue Oct 4 02:33:06 2016  nickmbailey:Joined the channel
Tue Oct 4 03:00:21 2016  kohlisankalp:Joined the channel
Tue Oct 4 03:15:16 2016  clohfink:Joined the channel
Tue Oct 4 03:17:55 2016  mstepura:Joined the channel
Tue Oct 4 03:28:15 2016  kohlisankalp:Joined the channel
Tue Oct 4 03:39:16 2016  mstepura:Joined the channel
Tue Oct 4 03:59:26 2016  kohlisankalp:Joined the channel
Tue Oct 4 04:26:44 2016  clohfink:Joined the channel
Tue Oct 4 04:29:14 2016  nickmbailey:Joined the channel
Tue Oct 4 04:35:07 2016  mstepura:Joined the channel
Tue Oct 4 04:46:14 2016  sood:Joined the channel
Tue Oct 4 05:08:10 2016  clohfink:Joined the channel
Tue Oct 4 05:22:45 2016  cosql:Joined the channel
Tue Oct 4 05:29:06 2016  clohfink:Joined the channel
Tue Oct 4 05:43:41 2016  mstepura:Joined the channel
Tue Oct 4 05:55:13 2016  mstepura:Joined the channel
Tue Oct 4 06:00:13 2016  nickmbailey:Joined the channel
Tue Oct 4 06:07:09 2016  mstepura:Joined the channel
Tue Oct 4 07:09:03 2016  gila:Joined the channel
Tue Oct 4 07:28:33 2016  cosql:Joined the channel
Tue Oct 4 07:39:41 2016  kvaster:Joined the channel
Tue Oct 4 07:52:47 2016  lqid:Joined the channel
Tue Oct 4 08:03:22 2016  nickmbailey:Joined the channel
Tue Oct 4 08:22:15 2016  kurtG:Joined the channel
Tue Oct 4 09:04:11 2016  nickmbailey:Joined the channel
Tue Oct 4 09:21:23 2016  kvaster:Joined the channel
Tue Oct 4 09:42:05 2016  kvaster:Joined the channel
Tue Oct 4 10:04:48 2016  nickmbailey:Joined the channel
Tue Oct 4 10:35:01 2016  clohfink:Joined the channel
Tue Oct 4 10:43:04 2016  lqid:Joined the channel
Tue Oct 4 11:05:39 2016  nickmbailey:Joined the channel
Tue Oct 4 11:47:16 2016  clohfink:Joined the channel
Tue Oct 4 11:59:23 2016  cassie_noob:Joined the channel
Tue Oct 4 11:59:37 2016  cassie_noob:hi
Tue Oct 4 12:06:23 2016  nickmbailey:Joined the channel
Tue Oct 4 12:10:48 2016  spodkowinski:Joined the channel
Tue Oct 4 12:16:12 2016  lqid_:Joined the channel
Tue Oct 4 12:28:27 2016  clohfink:Joined the channel
Tue Oct 4 12:39:38 2016  nickmbailey:Joined the channel
Tue Oct 4 12:50:52 2016  sood:Joined the channel
Tue Oct 4 12:52:06 2016  clohfink:Joined the channel
Tue Oct 4 12:54:39 2016  clohfink:Joined the channel
Tue Oct 4 13:14:52 2016  nickmbailey:Joined the channel
Tue Oct 4 13:17:16 2016  nickmbailey:Joined the channel
Tue Oct 4 13:28:02 2016  sood:Joined the channel
Tue Oct 4 13:31:31 2016  nickmbailey:Joined the channel
Tue Oct 4 13:36:21 2016  nickmbailey:Joined the channel
Tue Oct 4 14:03:10 2016  nickmbailey:Joined the channel
Tue Oct 4 14:20:54 2016  kohlisankalp:Joined the channel
Tue Oct 4 14:29:47 2016  sood:Joined the channel
Tue Oct 4 14:33:07 2016  tolbertam:Joined the channel
Tue Oct 4 15:19:35 2016  clohfink:Joined the channel
Tue Oct 4 15:24:01 2016  nickmbailey:Joined the channel
Tue Oct 4 15:39:22 2016  mpenick:Joined the channel
Tue Oct 4 15:44:52 2016  mpenick:Joined the channel
Tue Oct 4 15:56:57 2016  sood:Joined the channel
Tue Oct 4 16:10:57 2016  thobbs:Joined the channel
Tue Oct 4 16:15:32 2016  mstepura:Joined the channel
Tue Oct 4 16:15:59 2016  kohlisankalp:Joined the channel
Tue Oct 4 16:24:57 2016  kohlisankalp:Joined the channel
Tue Oct 4 16:40:33 2016  kohlisankalp:Joined the channel
Tue Oct 4 16:47:08 2016  mstepura:Joined the channel
Tue Oct 4 16:53:43 2016  jmckenzie:Joined the channel
Tue Oct 4 17:01:13 2016  kohlisankalp:Joined the channel
Tue Oct 4 17:15:11 2016  nickmbailey:Joined the channel
Tue Oct 4 17:22:28 2016  mpenick:Joined the channel
Tue Oct 4 18:11:41 2016  nickmbailey:Joined the channel
Tue Oct 4 18:21:10 2016  kohlisankalp:Joined the channel
Tue Oct 4 18:22:20 2016  nickmbai_:Joined the channel
Tue Oct 4 18:23:21 2016  clohfink:Joined the channel
Tue Oct 4 18:30:04 2016  nickmbailey:Joined the channel
Tue Oct 4 18:34:47 2016  kohlisankalp:Joined the channel
Tue Oct 4 18:34:56 2016  clohfink:Joined the channel
Tue Oct 4 18:37:22 2016  kohlisan_:Joined the channel
Tue Oct 4 18:38:22 2016  kohlisankalp:Joined the channel
Tue Oct 4 18:47:19 2016  kohlisankalp:Joined the channel
Tue Oct 4 18:50:25 2016  jmckenzie:Joined the channel
Tue Oct 4 18:52:36 2016  clohfink:Joined the channel
Tue Oct 4 18:59:32 2016  mstepura:Joined the channel
Tue Oct 4 19:23:17 2016  nickmbailey:Joined the channel
Tue Oct 4 19:29:06 2016  clohfink:Joined the channel
Tue Oct 4 19:40:19 2016  cosql:Joined the channel
Tue Oct 4 19:48:44 2016  thobbs:Joined the channel
Tue Oct 4 19:58:48 2016  kohlisankalp:Joined the channel
Tue Oct 4 20:39:14 2016  kurtG:Joined the channel
Tue Oct 4 20:49:15 2016  mstepura:Joined the channel
Tue Oct 4 21:10:44 2016  kohlisankalp:Joined the channel
Tue Oct 4 21:56:39 2016  mkjellman:Joined the channel
Tue Oct 4 21:57:35 2016  mkjellman:sooooooo... SSTableReader#tidy(), SSTableReader#DropPageCache, Ref.GlobalState.release, and CLibrary.trySkipCache(
Tue Oct 4 21:57:50 2016  mkjellman:SSTableReader#tidy() does // don't ideally want to dropPageCache for the file until all instances have been released
Tue Oct 4 21:57:50 2016  mkjellman:CLibrary.trySkipCache(desc.filenameFor(Component.DATA), 0, 0);
Tue Oct 4 21:57:51 2016  mkjellman:CLibrary.trySkipCache(desc.filenameFor(Component.PRIMARY_INDEX), 0, 0);
Tue Oct 4 21:57:55 2016  mkjellman:CLibrary.trySkipCache calls posix_fadvise with POSIX_FADV_DONTNEED which according to the documentation is “The specified data will not be accessed in the near future.”
Tue Oct 4 21:57:59 2016  mkjellman:posix_fadvise “Programs can use posix_fadvise() to announce an intention to access file data in a specific pattern in the future, thus allowing the kernel to perform appropriate optimizations.”
Tue Oct 4 21:58:03 2016  mkjellman:In kernel versions < 2.6.6 if len was 0 it meant 0 bytes… but now it means all bytes from the offset to the end of the file… we used to run 2.6 kernels and now we are on 3.8 so that’s totally different behavior here…
Tue Oct 4 21:58:35 2016  mkjellman:So it's hard to understand what the code is doing here.. as of trunk there is no additional check other than seeing if the os is linux or not... so is it intentional to use the < 2.6.6 behavior or the >= 2.6.6 behavior..
Tue Oct 4 21:58:43 2016  mkjellman:what is this code actually trying to do in the first place?
Tue Oct 4 21:58:52 2016  mkjellman:some comments would have been nice here..
Tue Oct 4 21:59:02 2016  mkjellman:this early open stuff is a total shitshow.
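To make the len == 0 difference described above concrete, here is a minimal sketch that only models the two interpretations; it does not call posix_fadvise. The class FadviseRange, the method effectiveLength, and the flag lenZeroMeansToEof are hypothetical names for illustration and are not part of Cassandra or CLibrary.

```java
/** Hypothetical helper, not Cassandra code: models how a len of 0 is interpreted. */
final class FadviseRange {
    private FadviseRange() {}

    /**
     * Returns how many bytes of the file the advice would cover.
     *
     * @param lenZeroMeansToEof true models Linux >= 2.6.6 (len == 0 means "to end of file"),
     *                          false models older kernels (len == 0 means literally zero bytes)
     */
    static long effectiveLength(long offset, long len, long fileSize, boolean lenZeroMeansToEof) {
        if (offset < 0 || len < 0 || fileSize < 0)
            throw new IllegalArgumentException("arguments must be non-negative");

        // Bytes between the offset and the end of the file (never negative).
        long remaining = Math.max(0L, fileSize - offset);

        if (len == 0)
            return lenZeroMeansToEof ? remaining : 0L;

        // A non-zero len is taken as given, limited to what the file actually holds.
        return Math.min(len, remaining);
    }
}
```

Under this model, a call with offset 0 and len 0 covers the whole file on newer kernels and nothing at all on kernels before 2.6.6, which is the behavior change being discussed.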
Tue Oct 4 22:00:30 2016  thobbs:Joined the channel
Tue Oct 4 22:01:13 2016  mkjellman:it seems to imply that any sstable that is written by SSTableRewriter will tell the kernel not to cache any of it from the start and i don't understand how that could possibly be a good thing... certainly the brand new sstables would be the ones we would want to cache the most, no???
Tue Oct 4 22:01:36 2016  mkjellman:and it seems like given objects are going thru Ref at this point, all of them should be hitting the Tidy code
Tue Oct 4 22:05:28 2016  jmckenzie_:Joined the channel
Tue Oct 4 22:06:42 2016  mstepura:Joined the channel
Tue Oct 4 22:07:02 2016  mkjellman:cc blambov driftx jeffj snazy
Tue Oct 4 22:08:09 2016  driftx:you're out of my wheelhouse
Tue Oct 4 22:09:35 2016  mkjellman:this shit is crazytown..
Tue Oct 4 22:15:02 2016  snazy:so, the TL;DR is: 'posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED)' is different for Linux < 2.6.6 and >= 2.6.6 ?
Tue Oct 4 22:17:40 2016  mkjellman:yes
Tue Oct 4 22:17:47 2016  mkjellman:well, that's 1 thing i found out of this
Tue Oct 4 22:18:12 2016  mkjellman:so once we know what the intended behavior was supposed to be we should check the kernel version if we can and make sure we do the correct thing as the api changed
Tue Oct 4 22:18:37 2016  mkjellman:but the bigger one is what is the purpose of this in the first place.. i don't understand why we would tell the kernel to drop the file from the page cache that we just created
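The kernel version check suggested above could look roughly like the following sketch. It assumes the version string has the usual Linux release form (for example "3.8.0-29-generic"), which is what System.getProperty("os.version") typically returns on Linux. KernelVersion and lenZeroMeansWholeFile are hypothetical names, not existing Cassandra code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not Cassandra code: decides which len == 0 semantics apply. */
final class KernelVersion {
    // Leading "major.minor" with an optional ".patch"; any suffix such as "-29-generic" is ignored.
    private static final Pattern VERSION = Pattern.compile("^(\\d+)\\.(\\d+)(?:\\.(\\d+))?");

    private KernelVersion() {}

    /** Returns true if the kernel release is 2.6.6 or newer. */
    static boolean lenZeroMeansWholeFile(String osVersion) {
        Matcher m = VERSION.matcher(osVersion);
        if (!m.find())
            throw new IllegalArgumentException("unrecognized kernel version: " + osVersion);

        int major = Integer.parseInt(m.group(1));
        int minor = Integer.parseInt(m.group(2));
        int patch = m.group(3) == null ? 0 : Integer.parseInt(m.group(3));

        // Compare against 2.6.6 component by component.
        if (major != 2) return major > 2;
        if (minor != 6) return minor > 6;
        return patch >= 6;
    }
}
```

A caller would pass the value of System.getProperty("os.version") and only rely on the "whole file" meaning of len == 0 when this returns true.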
Tue Oct 4 22:23:27 2016  adamholmberg:Joined the channel
Tue Oct 4 22:25:37 2016  snazy:hm - the differing behaviour is marked as a bug (http://man7.org/linux/man-pages/man2/posix_fadvise.2.html#BUGS)
Tue Oct 4 22:25:39 2016  jeffj:only time it seems like we should be fadvise dontneed is on ops like compaction, where we're reading through one time
Tue Oct 4 22:25:55 2016  snazy:i found three occurrences that use offset==len==0
Tue Oct 4 22:26:25 2016  snazy:one in SSTableRewriter ("guarded" with transaction.isOffline())
Tue Oct 4 22:26:43 2016  snazy:and two in SSTableReader.GlobalTidy#tidy for DATA + INDEX
Tue Oct 4 22:27:40 2016  clohfink_:Joined the channel
Tue Oct 4 22:29:44 2016  mkjellman:snazy: right so in SSTableReader.GlobalTidy#tidy i'm trying to understand why we want to dump it in the first place..
Tue Oct 4 22:31:00 2016  snazy:i guess, the posix_fadvise w/ offset==len==0 calls shall mean that the whole thing should be dropped from the page cache. Not sure, but we keep sstables open as long as possible?
Tue Oct 4 22:33:46 2016  mkjellman:i just made a build and entirely removed posix_fadvise to make trySkipCache a no-op
Tue Oct 4 22:33:50 2016  mkjellman:gonna see what it does to latencies
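The experiment above removed the posix_fadvise call outright in a custom build. A gentler variant of the same idea, sketched here purely as an assumption, is to put the hint behind a switch so the two behaviors can be compared without rebuilding. SkipCacheToggle and its constructor arguments are hypothetical; the real call in the discussion is CLibrary.trySkipCache, which this sketch does not invoke.

```java
import java.util.function.Consumer;

/** Hypothetical wrapper, not Cassandra code: makes the page cache hint optional. */
final class SkipCacheToggle {
    private final boolean enabled;
    private final Consumer<String> dropPageCache;

    /**
     * @param enabled       when false, trySkipCache is a no-op
     * @param dropPageCache the action that would issue the DONTNEED hint for a file path
     */
    SkipCacheToggle(boolean enabled, Consumer<String> dropPageCache) {
        this.enabled = enabled;
        this.dropPageCache = dropPageCache;
    }

    /** Issues the hint only when enabled; returns whether it was issued. */
    boolean trySkipCache(String path) {
        if (!enabled)
            return false;
        dropPageCache.accept(path);
        return true;
    }
}
```

With the switch off, the files stay in the page cache exactly as in the no-op build, so read latencies can be compared between the two settings.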
Tue Oct 4 22:34:10 2016  driftx:interesting.
Tue Oct 4 22:35:15 2016  snazy:found these related tickets: https://issues.apache.org/jira/browse/CASSANDRA-8683 https://issues.apache.org/jira/browse/CASSANDRA-8746 (2nd is probably not very interesting for this)
Tue Oct 4 22:35:23 2016  mkjellman:build is bounced in to my test perf cluster and i restarted the load.. let's see if there is any visible difference in read latencies
Tue Oct 4 22:36:07 2016  snazy:also: https://issues.apache.org/jira/browse/CASSANDRA-6916
Tue Oct 4 22:36:19 2016  mkjellman:i'm seeing reads taking 186ms+ to read 1 byte from Index files
Tue Oct 4 22:36:25 2016  mkjellman:it's murdering the p99.9th percentile
Tue Oct 4 22:37:53 2016  mkjellman:CASSANDRA-6916 seems to be actually the best explanation of what this code is supposed to do
Tue Oct 4 22:37:54 2016  CassBotJr:https://issues.apache.org/jira/browse/CASSANDRA-6916 (Resolved; Fixed; 2.1 beta2): "Preemptive opening of compaction result"
Tue Oct 4 22:40:04 2016  jeffj:also discussed https://issues.apache.org/jira/browse/CASSANDRA-6746
Tue Oct 4 22:40:34 2016  jeffj:"I'm not sure what the correct response to this is. Largely this is simply behaving as expected, except that really issuing a DONTNEED when we probably DO need is not a great idea. The rationale of course is that if we're compacting stale data we don't want to pollute the page cache; but if we're compacting live data we will actively destroy the page cache when the OS listens stringently to the DONTNEED (which in this case it apparently does even though it has plenty of room to ignore us). Unless we can be smarter about issuing these commands, I think issuing them isn't actually such a great idea, at least not on kernel versions that elicit this behaviour."
Tue Oct 4 22:41:07 2016  mkjellman:nice find jeff.
Tue Oct 4 22:41:23 2016  snazy:could be related to the comment "// don't ideally want to dropPageCache for the file until all instances have been released" - looks like a TODO?
Tue Oct 4 22:41:58 2016  jeffj:which goes all the way back to https://issues.apache.org/jira/browse/CASSANDRA-2635
Tue Oct 4 22:42:03 2016  mkjellman:"don't ideally want" == "todo" as "minor outage" == "hiccup" ? lol
Tue Oct 4 22:47:23 2016  kohlisankalp:Joined the channel
Tue Oct 4 23:03:03 2016  kohlisankalp:Joined the channel
Tue Oct 4 23:20:48 2016  adamholmberg:Joined the channel
Tue Oct 4 23:22:19 2016  adamholmberg:Joined the channel
Tue Oct 4 23:33:13 2016  mstepura:Joined the channel
Tue Oct 4 23:53:32 2016  kohlisankalp:Joined the channel
