Displaying #maven-dev/2017-02-16.log:

Thu Feb 16 08:35:42 2017  tibor_:Joined the channel
Thu Feb 16 10:05:38 2017  olamy:Joined the channel
Thu Feb 16 16:30:40 2017  tibor_:Joined the channel
Thu Feb 16 16:42:35 2017  Michael-O:Joined the channel
Thu Feb 16 16:43:42 2017  Michael-O:tibor_: I'll be ready to talk in ca. 30 min
Thu Feb 16 16:43:52 2017  tibor_:ok
Thu Feb 16 16:47:25 2017  Michael-O:I likely have already noticed that with MCLEAN 3.0.0. a lot more tests fail
Thu Feb 16 16:47:52 2017  Michael-O:You, not I
Thu Feb 16 17:18:00 2017  rfscholte:Joined the channel
Thu Feb 16 17:20:28 2017  Michael-O:tibor_: I am back now
Thu Feb 16 17:21:04 2017  tibor_:clean:3.0.0 failed in other projects than surefire?
Thu Feb 16 17:21:35 2017  tibor_:I do not see any talks in ML. Are there private emails around?
Thu Feb 16 17:21:48 2017  Michael-O:From us?
Thu Feb 16 17:22:05 2017  tibor_:for maven release talks
Thu Feb 16 17:22:16 2017  tibor_:or about us two
Thu Feb 16 17:22:33 2017  Michael-O:there is one from Stephen, pretty long mail chain.
Thu Feb 16 17:22:50 2017  tibor_:in ML you mean?
Thu Feb 16 17:22:53 2017  Michael-O:The rest about Surefire is just between us because we are the only one working on it
Thu Feb 16 17:22:55 2017  Michael-O:Yes, the ML.
Thu Feb 16 17:23:08 2017  tibor_:so where to start?
Thu Feb 16 17:23:22 2017  Michael-O:Let me preate the tarballs first...
Thu Feb 16 17:23:43 2017  tibor_:did we fix some errors?
Thu Feb 16 17:24:04 2017  Michael-O:Yes, the TestNG issue duplicate version is gone
Thu Feb 16 17:24:12 2017  Michael-O:No duplicate properties passed, just 5.7
Thu Feb 16 17:24:44 2017  Michael-O:As well as the WorkingDirectory IT which was caused by the missing flush
Thu Feb 16 17:26:07 2017  Michael-O:Can you transform the duplicate property issue into a branch?
Thu Feb 16 17:38:30 2017  tibor_:yes
Thu Feb 16 17:43:03 2017  tibor_:branch SUREFIRE_SYSPROP_DUPLICATES
Thu Feb 16 17:43:38 2017  Michael-O:Great, now let's talk about the issue. First, we should compare 3.3.9 with master
Thu Feb 16 17:43:44 2017  tibor_:what should i download?
Thu Feb 16 17:44:14 2017  Michael-O:maven-master.tar.gz and maven-3.3.9.tar.gz and oven the log files
Thu Feb 16 17:44:27 2017  Michael-O: /net/home/osipovmi/maven-surefire.log
Thu Feb 16 17:45:02 2017  tibor_:http://home.apache.org/~michaelo/maven/surefire/maven-surefire-extended-patch-maven-3.3.9.tar.gz
Thu Feb 16 17:45:10 2017  tibor_:http://home.apache.org/~michaelo/maven/surefire/maven-surefire-extended-patch-maven-master.tar.gz
Thu Feb 16 17:45:13 2017  tibor_:right?
Thu Feb 16 17:45:19 2017  Michael-O:perfect
Thu Feb 16 17:49:18 2017  tibor_:25 and 26 errors for master and 339
Thu Feb 16 17:50:03 2017  Michael-O:The only difference is:
Thu Feb 16 17:50:05 2017  Michael-O:testForkCountOneReuse(org.apache.maven.surefire.its.ForkModeMultiModuleIT): Exit code was non-zero: 1; command line and log = (..)
Thu Feb 16 17:50:06 2017  Michael-O: testForkCountTwoNoReuse(org.apache.maven.surefire.its.ForkModeTestNGIT): Exit code was non-zero: 1; command line and log = (..)
Thu Feb 16 17:50:15 2017  Michael-O:Should we investigate this?
Thu Feb 16 17:50:31 2017  tibor_:I will check it
Thu Feb 16 17:50:56 2017  tibor_:there is Process Exit Code: 0
Thu Feb 16 17:51:04 2017  tibor_:and the others have
Thu Feb 16 17:51:30 2017  tibor_:Process Exit Code: 1
Thu Feb 16 17:51:55 2017  tibor_:so this might be ForkedBoots call System.exit(1)
Thu Feb 16 17:52:04 2017  tibor_:lets see why
Thu Feb 16 17:52:09 2017  Michael-O:ok...
Thu Feb 16 17:55:05 2017  tibor_:are we sure the old code with OUT variable does not exist, not in the bytecode?
Thu Feb 16 17:55:12 2017  tibor_:in classes/ folder
Thu Feb 16 17:56:35 2017  Michael-O:I always do a clean, let me run git diff
Thu Feb 16 17:56:43 2017  tibor_:ok then
Thu Feb 16 17:57:52 2017  Michael-O:uses of OUT has been commented by you, so no NPE hre
Thu Feb 16 17:57:53 2017  Michael-O:here
Thu Feb 16 18:05:32 2017  tibor_:I have picked up simple test Surefire1080ParallelForkDoubleTestIT
Thu Feb 16 18:05:43 2017  tibor_:there are three places with exit 1
Thu Feb 16 18:05:54 2017  tibor_:commented two and one remains
Thu Feb 16 18:06:06 2017  tibor_:lets see which is the cause
Thu Feb 16 18:06:16 2017  tibor_:again no dump file
Thu Feb 16 18:06:25 2017  tibor_:so we do not know what's going on
Thu Feb 16 18:07:23 2017  Michael-O:We should probably enable dumps and streams on all tests for now
Thu Feb 16 18:09:26 2017  tibor_:you mean the yesterday's dump with OUT?
Thu Feb 16 18:09:53 2017  tibor_:the dumpstream is there always
Thu Feb 16 18:10:00 2017  tibor_:this is enabled
Thu Feb 16 18:10:16 2017  Michael-O:Yes, would that help us in this case? I don't see any in Surefire1080
Thu Feb 16 18:10:17 2017  tibor_:pls see the Surefire1080*IT
Thu Feb 16 18:10:35 2017  tibor_:I do not see target and log.txt
Thu Feb 16 18:11:24 2017  tibor_:how could this happen
Thu Feb 16 18:12:00 2017  Michael-O:surefire-integration-tests/target, all is in the tarball, but no stream.out or dumpstream
Thu Feb 16 18:13:43 2017  tibor_:yes
Thu Feb 16 18:15:03 2017  tibor_:I have suspicion that the shutdown mechanism exited the JVM after timing out 20 seconds
Thu Feb 16 18:15:23 2017  tibor_:this is the part I have commented out in the ForkedBooter
Thu Feb 16 18:15:28 2017  Michael-O:running Surefire1080 with the new class
Thu Feb 16 18:15:54 2017  tibor_:[INFO] Total time: 22.322 s
Thu Feb 16 18:15:58 2017  tibor_:this is it
Thu Feb 16 18:16:14 2017  Michael-O:how can I increase the timeout?
Thu Feb 16 18:16:27 2017  tibor_:the PING event was not send by maven plugin or lost in forked jvm
Thu Feb 16 18:16:41 2017  tibor_:there is a config
Thu Feb 16 18:16:43 2017  tibor_:let me see
Thu Feb 16 18:19:14 2017  tibor_:cannot find it
Thu Feb 16 18:19:45 2017  tibor_:Pls use -Dcheckstyle.skip=true
Thu Feb 16 18:20:26 2017  Michael-O:The modified ForkedBooter makes surefire1080 pass
Thu Feb 16 18:22:01 2017  tibor_:ok so it's this PINGing problem
Thu Feb 16 18:22:56 2017  tibor_:so lets print PING sent by plugin on the console, and let's do the same when received
Thu Feb 16 18:23:18 2017  tibor_:if not seens in console we know where to have a look
Thu Feb 16 18:24:28 2017  Michael-O:good...
Thu Feb 16 18:25:23 2017  tibor_:to make sure I commened only one exit 1
Thu Feb 16 18:25:29 2017  tibor_:sening again
Thu Feb 16 18:26:27 2017  tibor_:email coming and i am going to println()
Thu Feb 16 18:26:55 2017  Michael-O:ok
Thu Feb 16 18:27:00 2017  Michael-O:will patch in a few minute
Thu Feb 16 18:27:02 2017  Michael-O:s
Thu Feb 16 18:37:42 2017  Michael-O:Running master with your new patch
Thu Feb 16 18:41:00 2017  tibor_:ok
Thu Feb 16 18:41:39 2017  tibor_:do you have faster machine today?
Thu Feb 16 18:42:24 2017  Michael-O:Currently the same. The test has passed with your new patch.
Thu Feb 16 18:42:35 2017  Michael-O:I am also preparing a branch new machine with RHEL 6
Thu Feb 16 18:44:22 2017  Michael-O:The test has passed, did expect that now?
Thu Feb 16 18:45:04 2017  tibor_:yes
Thu Feb 16 18:45:17 2017  tibor_:but now we will pass it again
Thu Feb 16 18:45:39 2017  Michael-O:OK, now you will modify some code, right?
Thu Feb 16 18:45:40 2017  tibor_:and print messages from timers
Thu Feb 16 18:45:55 2017  tibor_:so there are two actors
Thu Feb 16 18:46:03 2017  tibor_:plugin process sending NOOP
Thu Feb 16 18:46:11 2017  tibor_:and forked jvm receiving it
Thu Feb 16 18:46:36 2017  tibor_:we will see timestamp, and if we dont see then we know where to have a look
Thu Feb 16 18:50:01 2017  Michael-O:I see, the communication. If this is the case, we can make them more robust.
Thu Feb 16 18:50:33 2017  tibor_:how you see it?
Thu Feb 16 18:50:52 2017  tibor_:what tool you use?
Thu Feb 16 18:51:00 2017  Michael-O:Deducing from your explanation. It would make sense if it turnes out the way you describe.
Thu Feb 16 18:51:15 2017  Michael-O:Because there is not dump or strack trace of the forked VM.
Thu Feb 16 18:51:48 2017  Michael-O:One could also run all VMs with strace or similar and watch for system calls
Thu Feb 16 18:51:59 2017  Michael-O:printf and exit is always logged
Thu Feb 16 18:55:15 2017  tibor_:yes
Thu Feb 16 18:56:20 2017  tibor_:it seems there is no exception
Thu Feb 16 18:57:12 2017  tibor_:the one in log.txt is not the cause
Thu Feb 16 18:58:14 2017  tibor_:the exceptions in log is only consequence of something in this case
Thu Feb 16 18:58:33 2017  Michael-O:Yes, can we enable the PING PONG between VMs?
Thu Feb 16 18:59:11 2017  tibor_:this is there by default and system out is used as a channel
Thu Feb 16 18:59:27 2017  tibor_:we can add a new event, it's enum
Thu Feb 16 18:59:40 2017  khmarbaise:Joined the channel
Thu Feb 16 18:59:52 2017  tibor_:sending back to plugin is more difficult
Thu Feb 16 19:01:02 2017  tibor_:can we see more via jconsole if we e.g. put 1 minute delay in junit tests
Thu Feb 16 19:02:46 2017  khmarbaise:Hi, can someone give me hint what you have checked to produce the log file http://home.apache.org/~michaelo/maven/surefire/MCLEAN-3.0.0-bug/maven-surefire-MCLEAN.tar.gz ? (Unfortunately not there)…
Thu Feb 16 19:05:52 2017  Michael-O:khmarbaise: just a second
Thu Feb 16 19:06:07 2017  Michael-O:tibor_: patched fiels. Which IT have I to run?
Thu Feb 16 19:06:23 2017  tibor_:same as previous
Thu Feb 16 19:06:30 2017  khmarbaise:np
Thu Feb 16 19:06:31 2017  Michael-O:khmarbaise: have a look at the branch MNG-6169 first, diff against master
Thu Feb 16 19:06:59 2017  khmarbaise:Ok..
Thu Feb 16 19:07:12 2017  Michael-O:you should see that I have updated some plugins in the lifecycle binding
Thu Feb 16 19:07:58 2017  Michael-O:khmarbaise: files have been consolidated here: http://home.apache.org/~michaelo/maven/surefire/
Thu Feb 16 19:08:15 2017  khmarbaise:Ok..thanks.
Thu Feb 16 19:08:58 2017  Michael-O:download both tarballs with MNG-6169
Thu Feb 16 19:12:44 2017  Michael-O:tibor_: test has passed, output is there
Thu Feb 16 19:13:07 2017  tibor_:send me some log
Thu Feb 16 19:13:37 2017  Michael-O:sent
Thu Feb 16 19:15:40 2017  Michael-O:khmarbaise: do you have them now?
Thu Feb 16 19:15:50 2017  khmarbaise:Yes I have them...
Thu Feb 16 19:16:09 2017  khmarbaise:Are those log files from Maven Core integration tests ?
Thu Feb 16 19:17:32 2017  Michael-O:no, Surefire master. I ran Maven master on Christian's request
Thu Feb 16 19:17:42 2017  Michael-O:you have two maven-surefire.log files
Thu Feb 16 19:18:01 2017  khmarbaise:Ah Ok..I see…with current Maven Core master (3.5.0)…
Thu Feb 16 19:18:10 2017  Michael-O:MNG-6169-MCLEAN-3.0.0 is identical, but the plugin upgraded from 2.6.1 to 3.0.0
Thu Feb 16 19:18:48 2017  Michael-O:Compare the last hundred lines of the fire: failed tests, you will see that with 3.0.0 some more tests fail Surefire772*IT
Thu Feb 16 19:19:18 2017  khmarbaise:now i’ve taken a look into it…Ok….
Thu Feb 16 19:22:12 2017  tibor_:Michael, do you see the lines I printed to console log?
Thu Feb 16 19:23:21 2017  Michael-O:Yes, ForkStarter NOOP
Thu Feb 16 19:23:29 2017  Michael-O:CommandReader NOOP, etc.
Thu Feb 16 19:23:42 2017  tibor_:can you send me?
Thu Feb 16 19:23:51 2017  Michael-O:I did to your private mail
Thu Feb 16 19:24:14 2017  Michael-O:tibor17...
Thu Feb 16 19:24:31 2017  tibor_:ah ok that one
Thu Feb 16 19:24:49 2017  Michael-O:khmarbaise: do you see the diff in the failures based on the log files?
Thu Feb 16 19:24:54 2017  khmarbaise:sure
Thu Feb 16 19:27:26 2017  Michael-O:Any idea how we can investigate that diff?
Thu Feb 16 19:27:33 2017  Michael-O:3.0.1-SNAPSHOT fails also for this test
Thu Feb 16 19:29:53 2017  Michael-O:tibor_: 12 cores are now working on RHEL6 :-d
Thu Feb 16 19:30:09 2017  khmarbaise:Not yet really but give me some time to dive more deeply into that…
Thu Feb 16 19:30:36 2017  Michael-O:ok, sure
Thu Feb 16 19:30:39 2017  tibor_:Previously when the Surefire1080ParallelForkDoubleTestIT was running it was 1 core CPU?
Thu Feb 16 19:31:08 2017  Michael-O:no, 4 core on FreeBSD, I have added now a 12-core machine with RHEL6 for testing
Thu Feb 16 19:31:10 2017  tibor_:I mean previously when all ITs were running
Thu Feb 16 19:31:42 2017  tibor_:because now [INFO] Total time: 4.911 s
Thu Feb 16 19:32:07 2017  Michael-O:no no, all files now are still on 4-core
Thu Feb 16 19:32:12 2017  tibor_:Is it possible that the GC consumed CPU usage so much?
Thu Feb 16 19:32:35 2017  Michael-O:maybe, but this would require us to enable GC log also
Thu Feb 16 19:32:38 2017  tibor_:because the NOOP event appears
Thu Feb 16 19:34:00 2017  tibor_:if you see the log you sent me the timestamp and the difference between start and end is cca 0.6 sec
Thu Feb 16 19:34:29 2017  tibor_:and this did not elapse the 20 sec timeout of not received NOOP
Thu Feb 16 19:34:38 2017  tibor_:or maybe
Thu Feb 16 19:34:41 2017  tibor_:OOME?
Thu Feb 16 19:34:54 2017  tibor_:and test recovered somehow?
Thu Feb 16 19:34:59 2017  Michael-O:If you, this should be logged somewhere
Thu Feb 16 19:35:08 2017  Michael-O:note that this is a 32 bit VM
Thu Feb 16 19:35:30 2017  tibor_:these test start via Verifier
Thu Feb 16 19:35:33 2017  Michael-O:We can, of course assign more memory to the VMs during test
Thu Feb 16 19:35:36 2017  tibor_:how to add GC log
Thu Feb 16 19:36:03 2017  tibor_:i am thinking of memory of forked jvm
Thu Feb 16 19:36:14 2017  Michael-O:http://stackoverflow.com/a/895489/696632
Thu Feb 16 19:36:19 2017  tibor_:becuase all of these tests which fail are about forking only
Thu Feb 16 19:36:40 2017  Michael-O:Right but the VM writes that it could not allocate native memory
Thu Feb 16 19:37:53 2017  tibor_:we can add verbose gc to MavenLauncher and this will be global in all tests
Thu Feb 16 19:37:59 2017  tibor_:WDYT?
Thu Feb 16 19:38:25 2017  tibor_:and we can do the saem in MAVEN_OPTS
Thu Feb 16 19:38:57 2017  Michael-O:Do you really think it could be memory? What confuses me most is that we have exit code 1, but now sterr from the VM why it failed
Thu Feb 16 19:39:12 2017  tibor_:ah
Thu Feb 16 19:39:14 2017  tibor_:ok
Thu Feb 16 19:39:23 2017  tibor_:now the test is faster
Thu Feb 16 19:39:28 2017  tibor_:I do not know why
Thu Feb 16 19:39:29 2017  tibor_:but
Thu Feb 16 19:39:48 2017  tibor_:lets uncomment the line in ForkedBooter
Thu Feb 16 19:40:38 2017  tibor_:I do not see 'ForkedBooter PING timer: plugin did not send me NOOP signal > exit'
Thu Feb 16 19:41:09 2017  tibor_:which is good because this line means that going to call exit(1) which is commented out at the moment
Thu Feb 16 19:41:36 2017  tibor_:so it means all the mechanism works
Thu Feb 16 19:41:48 2017  tibor_:but something breaks it
Thu Feb 16 19:41:52 2017  Michael-O:That is really good.
Thu Feb 16 19:42:07 2017  tibor_:something breaks it when the test takes long time
Thu Feb 16 19:42:12 2017  tibor_:this is the problem
Thu Feb 16 19:42:20 2017  tibor_:maybe OutOfMemoryError
Thu Feb 16 19:42:22 2017  Michael-O:Lets really capture all output again
Thu Feb 16 19:42:28 2017  tibor_:and consumed
Thu Feb 16 19:42:33 2017  tibor_:in a Threads
Thu Feb 16 19:42:39 2017  Michael-O:OOME must be written to stdrr
Thu Feb 16 19:42:40 2017  tibor_:or something else
Thu Feb 16 19:42:48 2017  Michael-O:Can you add this code again?
Thu Feb 16 19:42:54 2017  tibor_:sure
Thu Feb 16 19:43:37 2017  tibor_:sending email
Thu Feb 16 19:46:42 2017  tibor_:If you see these tests launched bySurefire1080*IT
Thu Feb 16 19:46:43 2017  tibor_:https://github.com/apache/maven-surefire/tree/master/surefire-integration-tests/src/test/resources/surefire-1080-parallel-fork-double-test/src/test/java/com/cal
Thu Feb 16 19:46:52 2017  tibor_:they are empty test methods
Thu Feb 16 19:47:30 2017  tibor_:so no reason that the forked jvm runs 22 seconds
Thu Feb 16 19:48:24 2017  Michael-O:running tests with new patch
Thu Feb 16 19:49:01 2017  Michael-O:maybe either a GC pause or a context switch by the OS
Thu Feb 16 19:52:45 2017  tibor_:how many tests yuo run now?
Thu Feb 16 19:53:15 2017  Michael-O:Single test completed successfully on FreeBSD, full test suite on RHEL6
Thu Feb 16 19:54:10 2017  Michael-O:The changed exit(1) passed the test
Thu Feb 16 19:54:32 2017  tibor_:and the test took 5 sec again?
Thu Feb 16 19:54:45 2017  Michael-O:6,85 s
Thu Feb 16 19:55:34 2017  tibor_:is it the same platform when we had [INFO] Total time: 22.322 s before?
Thu Feb 16 19:56:39 2017  Michael-O:Yes, of course:
Thu Feb 16 19:56:41 2017  Michael-O:Running org.apache.maven.surefire.its.jiras.Surefire1080ParallelForkDoubleTestIT
Thu Feb 16 19:56:43 2017  Michael-O:Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 24.829 sec <<< FAILURE!
Thu Feb 16 19:56:47 2017  Michael-O:so the failing run was > 20 s
Thu Feb 16 19:56:55 2017  tibor_:There was Jira issue where the user wanted me to configure the period of sending NOOP.
Thu Feb 16 19:56:58 2017  Michael-O:Are we really running in a timeout issue?
Thu Feb 16 19:57:20 2017  tibor_:not yet
Thu Feb 16 19:57:27 2017  tibor_:because timeout is 20 sec
Thu Feb 16 19:57:56 2017  tibor_:but it is not the timeout of how long the fork may run
Thu Feb 16 19:58:31 2017  Michael-O:Similar:
Thu Feb 16 19:58:33 2017  Michael-O:Running org.apache.maven.surefire.its.ForkModeIT
Thu Feb 16 19:58:34 2017  Michael-O:Tests run: 12, Failures: 0, Errors: 3, Skipped: 0, Time elapsed: 111.837 sec <<< FAILURE!
Thu Feb 16 19:58:36 2017  Michael-O:testForkCountTwoNoReuse(org.apache.maven.surefire.its.ForkModeIT) Time elapsed: 0 sec <<< ERROR!
Thu Feb 16 19:58:36 2017  tibor_:it is a 200 % period of time how often the NOOP event should be sent
Thu Feb 16 19:58:38 2017  Michael-O:112 seconds
Thu Feb 16 19:59:14 2017  tibor_:let's send me the log
Thu Feb 16 19:59:33 2017  Michael-O:You have it, maven-master.tar.gz
Thu Feb 16 19:59:35 2017  Michael-O:it is all there
Thu Feb 16 20:00:13 2017  Michael-O:note: all tests just passed on the fast machine with RHEL6, Java 8 Update 121, Maven master and 12 cores.
Thu Feb 16 20:00:26 2017  Michael-O:This may really be a timing issue in the communication between plugin and VM
Thu Feb 16 20:01:06 2017  Michael-O:But what if a test takes longer than the expected comm timeout? Will the fork be aborted?
Thu Feb 16 20:02:21 2017  tibor_:no
Thu Feb 16 20:02:45 2017  tibor_:becuae this is not timeout of test lifecycle
Thu Feb 16 20:03:06 2017  tibor_:it is a period 10 seconds when NOOP event is sent from Maven to forked JVM
Thu Feb 16 20:03:50 2017  Michael-O:What if it takes longer?
Thu Feb 16 20:04:07 2017  tibor_:and fork is waiting max two periods after which if not received NOOP then it means Maven was killed in Jenkins and forked jvm kills itself
Thu Feb 16 20:04:48 2017  Michael-O:Ah I see, how can we increase that for a test?
Thu Feb 16 20:04:48 2017  tibor_:Without this mechanism the processes leak and may block IO on file system
Thu Feb 16 20:05:02 2017  tibor_:it is hard coded
Thu Feb 16 20:05:16 2017  tibor_:but can be configured
Thu Feb 16 20:05:23 2017  tibor_:but this takes time
Thu Feb 16 20:05:35 2017  Michael-O:You don't know the spot, do you?
Thu Feb 16 20:05:37 2017  tibor_:or I can change the period
Thu Feb 16 20:05:46 2017  tibor_:in Java code
Thu Feb 16 20:06:20 2017  Michael-O:Let's do and try
Thu Feb 16 20:06:23 2017  tibor_:we can do that but we won't know the rootcause
Thu Feb 16 20:06:46 2017  Michael-O:What if this is the case?
Thu Feb 16 20:06:49 2017  tibor_:so how long period to send NOOP
Thu Feb 16 20:06:51 2017  Michael-O:It is at least worh a try
Thu Feb 16 20:07:10 2017  tibor_:60 sec?
Thu Feb 16 20:07:12 2017  Michael-O:With you debug out put we should see times in previously failed test, right?
Thu Feb 16 20:07:23 2017  tibor_:yes
Thu Feb 16 20:07:31 2017  Michael-O:let's go 30 first
Thu Feb 16 20:07:35 2017  tibor_:ok
Thu Feb 16 20:07:40 2017  Michael-O:Which classes do I have to changes?
Thu Feb 16 20:07:46 2017  Michael-O:Forkstarter:121?
Thu Feb 16 20:08:02 2017  Michael-O:ThreadStreamConsumer:42?
Thu Feb 16 20:08:08 2017  tibor_:yes
Thu Feb 16 20:08:14 2017  tibor_:and ForkedBooter
Thu Feb 16 20:08:26 2017  tibor_:L72
Thu Feb 16 20:08:56 2017  Michael-O:on my way
Thu Feb 16 20:08:57 2017  tibor_:nothing in ThreadStreamConsumer
Thu Feb 16 20:09:21 2017  tibor_:ITEM_LIMIT_BEFORE_SLEEP is different story :)
Thu Feb 16 20:09:26 2017  Michael-O:ok
Thu Feb 16 20:10:24 2017  tibor_:ITEM_LIMIT_BEFORE_SLEEP is handy if the Arquillian sends thousands of logs each second or very fast
Thu Feb 16 20:11:01 2017  Michael-O:OK, changed both values to 30
Thu Feb 16 20:11:05 2017  tibor_:it prevents from growing the queue infinitively
Thu Feb 16 20:11:07 2017  Michael-O:running all tests again
Thu Feb 16 20:11:14 2017  tibor_:moment
Thu Feb 16 20:11:21 2017  Michael-O:yes?
Thu Feb 16 20:11:51 2017  tibor_:ForkedBooter must be 2times longer than period in ForkStarter
Thu Feb 16 20:12:35 2017  tibor_:you know ?
Thu Feb 16 20:12:58 2017  Michael-O:ok, now I have 30 and 60 set
Thu Feb 16 20:13:04 2017  tibor_:right
Thu Feb 16 20:13:05 2017  Michael-O:Can I run them now?
Thu Feb 16 20:13:08 2017  tibor_:lets go
Thu Feb 16 20:13:44 2017  Michael-O:Btw, what do I need to pass to surefire to run several tests in parallel instead of single mode?
Thu Feb 16 20:14:08 2017  tibor_:I have never done this
Thu Feb 16 20:14:12 2017  Michael-O:khmarbaise: I will leave for 45 min, any questions?
Thu Feb 16 20:14:19 2017  Michael-O:ok, I will leave as is
Thu Feb 16 20:14:22 2017  khmarbaise:I diving into ...
Thu Feb 16 20:14:27 2017  khmarbaise:NP...
Thu Feb 16 20:14:33 2017  tibor_:because the build gos to 80 - 90 % CPU usage already
Thu Feb 16 20:15:43 2017  tibor_:and we have performance tests, and if the sleep() was prolonged by 0.5 second due to overloaded CPU, then the test fails
Thu Feb 16 20:16:20 2017  tibor_:it is JunitCoreParallelIT or something, running long time cca 5 min
Thu Feb 16 20:17:43 2017  tibor_:sorry JUnit47ParallelIT, 59 test methods. Heavy test.
Thu Feb 16 20:19:00 2017  tibor_:Michael-O: would we add -verbose:gc afterwards?
Thu Feb 16 20:20:40 2017  tibor_:Maybe we would need to make fine tuning with this period.
Thu Feb 16 20:23:33 2017  Michael-O:Joined the channel
Thu Feb 16 20:33:00 2017  tibor_:all of the tests, except for JUnit4RerunFailingTestsIT, trigger parallel forks
Thu Feb 16 21:05:13 2017  tibor_:Michael-O: How much total RAM memory is on your FreeBSD and how much is for our build?
Thu Feb 16 21:14:48 2017  Michael-O:tibor_: Maven uses default value from the VM
Thu Feb 16 21:15:55 2017  Michael-O:The machine has 3,25 GiB RAM and 4 GiB of swap
Thu Feb 16 21:16:25 2017  Michael-O:Intel Xeon E5440 @ 2,83 GHz
Thu Feb 16 21:16:27 2017  tibor_:I made fine tuning on x64: Xmx 750 MB and MaxPermSize cca 500MB, and each fork 200MB
Thu Feb 16 21:16:30 2017  Michael-O:4 cores
Thu Feb 16 21:17:40 2017  tibor_:is the latest version of Maven eating more memory than let's say 3.1 or 3.2.1?
Thu Feb 16 21:21:00 2017  Michael-O: I did not check, I ran 3.3.9 and master only. I can try 3.2.5 too tomorrow and see what the memory output after build date says
Thu Feb 16 21:34:42 2017  tibor_:Now the build is going to finish?
Thu Feb 16 21:46:59 2017  Michael-O:tibor_: build is done already
Thu Feb 16 21:47:20 2017  Michael-O:No change, but all of the failing tests ran more than 60 seconds, 90, 180, etc.
Thu Feb 16 21:47:29 2017  Michael-O:So they may be subject to the timeout
Thu Feb 16 21:48:32 2017  tibor_:it sounds to me there is non-daemon thared in each fork
Thu Feb 16 21:48:39 2017  tibor_:thread
Thu Feb 16 21:49:09 2017  tibor_:but surefire uses only daemon threads
Thu Feb 16 21:49:28 2017  Michael-O:Can we check this somehow?
Thu Feb 16 21:49:40 2017  tibor_:using jconsole
Thu Feb 16 21:50:03 2017  tibor_:when we tested with single IT it was ok
Thu Feb 16 21:50:42 2017  Michael-O:There is no way to dump with the forker?
Thu Feb 16 21:51:18 2017  Michael-O:kill -3 JAVA_PID
Thu Feb 16 21:51:28 2017  Michael-O:jstack -l JAVA_PID
Thu Feb 16 21:51:57 2017  Michael-O:so I need to run a single IT, wait for it to fail, run again and dump threads, right?
Thu Feb 16 21:52:06 2017  tibor_:in java we can dump the threads
Thu Feb 16 21:52:23 2017  tibor_:yes
Thu Feb 16 21:52:27 2017  Michael-O:from within Java?
Thu Feb 16 21:52:28 2017  tibor_:was it Surefire1080ParallelForkDoubleTestIT before?
Thu Feb 16 21:52:47 2017  Michael-O:Yes
Thu Feb 16 21:52:50 2017  tibor_:I think is last for 5 sec
Thu Feb 16 21:53:13 2017  Michael-O:Single 6, in general > 20
Thu Feb 16 21:54:22 2017  Michael-O:ForkModeIT took 192 seconds
Thu Feb 16 21:54:32 2017  Michael-O:I don't expect this to be much faster in single mode
Thu Feb 16 21:55:15 2017  tibor_:I am thinking
Thu Feb 16 21:55:28 2017  tibor_:what is rootcause
Thu Feb 16 21:55:43 2017  Michael-O:Check master-2.tar.gz
Thu Feb 16 21:55:51 2017  Michael-O:this has tripled timeouts
Thu Feb 16 21:56:11 2017  Michael-O:I am running ForkModeIT right now
Thu Feb 16 22:03:33 2017  tibor_:What it makes difference if we run all tests and one.
Thu Feb 16 22:03:52 2017  Michael-O:This IT failed, I have now tripled the timeout from 30 to 90 and will see in a few minutes wether it fails again
Thu Feb 16 22:04:07 2017  Michael-O:We'll know wether it is timeout related or not and I will run jstrack on it
Thu Feb 16 22:09:16 2017  tibor_:The problem is that now we do not see System.out.println(System.currentTimeMillis() + " CommandReader " + command.getCommandType());
Thu Feb 16 22:09:41 2017  Michael-O:I have the threads of counttworeuse
Thu Feb 16 22:09:42 2017  tibor_:the NOOP event is not sent or not received
Thu Feb 16 22:10:19 2017  tibor_:well, it is sent in ForkStarter
Thu Feb 16 22:10:28 2017  tibor_:so we not receive it
Thu Feb 16 22:11:01 2017  Michael-O:I have uploaded two new log files
Thu Feb 16 22:11:04 2017  Michael-O:with the thread dump
Thu Feb 16 22:11:25 2017  Michael-O:countTwoReuse case
Thu Feb 16 22:13:51 2017  Michael-O:Joined the channel
Thu Feb 16 22:14:32 2017  tibor_:"last-ditch-daemon-shutdown-thread-0s"
Thu Feb 16 22:14:39 2017  tibor_:0s is wrong
Thu Feb 16 22:14:45 2017  tibor_:0 period
Thu Feb 16 22:18:28 2017  Michael-O:what should it say?
Thu Feb 16 22:21:35 2017  tibor_:maybe good catch
Thu Feb 16 22:21:54 2017  tibor_:I am just watching ForkBooter
Thu Feb 16 22:23:15 2017  Michael-O:systemExitTimeoutInSeconds =
Thu Feb 16 22:23:17 2017  Michael-O: providerConfiguration.systemExitTimeout( DEFAULT_SYSTEM_EXIT_TIMEOUT_IN_SECONDS );
Thu Feb 16 22:23:23 2017  Michael-O:someone passes 0 here
Thu Feb 16 22:24:57 2017  Michael-O:but it should be 30 when retrieved here createJvmTerminator()
Thu Feb 16 22:26:42 2017  tibor_:yes
Thu Feb 16 22:26:55 2017  tibor_:order of variables initialization
Thu Feb 16 22:26:57 2017  Michael-O:it is zero again here
Thu Feb 16 22:27:37 2017  Michael-O:Can you check on your machine, it is really 30?
Thu Feb 16 22:28:13 2017  tibor_:createJvmTerminator() reads systemExitTimeoutInSeconds not yet initialized
Thu Feb 16 22:28:37 2017  Michael-O:right and it is volatile
Thu Feb 16 22:31:49 2017  Michael-O:But why did it work up until now?
Thu Feb 16 22:34:08 2017  tibor_:no idea. Anyway this must be fixed later
Thu Feb 16 22:34:10 2017  tibor_:but
Thu Feb 16 22:34:15 2017  tibor_:this is bad:
Thu Feb 16 22:34:17 2017  tibor_:1487277444079 ForkStarter NOOP before a loop 1487277444079 ForkStarter NOOP sent in a loop 1487277444081 ForkStarter NOOP sent in a loop 1487277474119 ForkStarter NOOP before a loop 1487277474120 ForkStarter NOOP sent in a loop 1487277474120 ForkStarter NOOP sent in a loop 1487277504093 ForkStarter NOOP before a loop 1487277504093 ForkStarter NOOP sent in a loop 1487277504093 ForkStarter NOOP sent in a loop [WARNING] 1487277444318
Thu Feb 16 22:34:40 2017  tibor_:see the log ForkModeIT_testForkModeOncePerThreadTwoThreads
Thu Feb 16 22:34:48 2017  tibor_:[INFO] Tests run: 0, Failures: 0, Errors: 0, Skipped: 0
Thu Feb 16 22:34:58 2017  tibor_:and no log from CommandReader
Thu Feb 16 22:35:27 2017  tibor_:the Thread which is responsible for reading what test class to execute does not run
Thu Feb 16 22:35:41 2017  tibor_:it is the same threads which should handle NOOP
Thu Feb 16 22:36:32 2017  tibor_:surefire-forkedjvm-command-thread
Thu Feb 16 22:37:22 2017  Michael-O:So the communication is unidrectional
Thu Feb 16 22:37:32 2017  tibor_:The thread is waiting for data from std/in (MasterProcessCommand.java:116)
Thu Feb 16 22:40:17 2017  Michael-O:180/360 combo fails too, it is not the timeout, but every test says: ForkedBooter PING timer: plugin did not send me NOOP signal > exit
Thu Feb 16 22:40:22 2017  tibor_:there is a problem with cross process communication
Thu Feb 16 22:40:38 2017  tibor_:yes
Thu Feb 16 22:40:41 2017  tibor_:this is it
Thu Feb 16 22:41:04 2017  tibor_:the plugin sent the signal, but the forked jvm is locked to receive data
Thu Feb 16 22:41:26 2017  tibor_:look at the thread dump
Thu Feb 16 22:41:29 2017  Michael-O:I have updated master-3.tar.gz
Thu Feb 16 22:41:33 2017  tibor_:"surefire-forkedjvm-command-thread" #10 daemon prio=5 os_prio=15 tid=0x29eb4c00 nid=0x18c57 runnable [0xbb58e000]
Thu Feb 16 22:42:27 2017  Michael-O:yes, it is running and waiting for bytes to read
Thu Feb 16 22:43:44 2017  tibor_:exactly
Thu Feb 16 22:44:29 2017  tibor_:maybe something with maven-shared-utils which has the facility responsible for such communication
Thu Feb 16 22:45:56 2017  Michael-O:Could the BufferedReader block us?
Thu Feb 16 22:46:09 2017  Michael-O:buffer is not full enough or so?
Thu Feb 16 22:49:14 2017  tibor_:this is internal jdk implementation
Thu Feb 16 22:49:28 2017  tibor_:it should not block us
Thu Feb 16 22:49:53 2017  tibor_:we can put println() around: DataInputStream stdIn = new DataInputStream( System.in );
Thu Feb 16 22:50:07 2017  tibor_:and see if we at least first time received something
Thu Feb 16 22:50:07 2017  Michael-O:tibor_: I will leave now. If you have anything valuable, just email me. I wil ltest it tormorrow in the machine again. We now have two new issue: the order of calls and the lack of communication
Thu Feb 16 22:50:21 2017  tibor_:ok, good night
Thu Feb 16 22:50:24 2017  Michael-O:n8
Thu Feb 16 22:50:24 2017  tibor_:and thx
Thu Feb 16 22:50:31 2017  Michael-O:welcome
Thu Feb 16 22:50:41 2017  Michael-O:maybe Karl Heinz has some news for us tomorrow

Comments