태터데스크 관리자

도움말
닫기
적용하기   첫페이지 만들기

태터데스크 메시지

저장하였습니다.

Sun Java Forum - HotSpot 최적화에 의한 착각...

Enterprise Java 2007.10.03 23:38
HotSpot JVM의 가장 큰 장점은 "컴파일 최적화"에 있다. 전통적인 JIT 컴파일러의 장점은 그대로 살리면서 단점(늦은 구동시간과 비효율적인 최적화)을 해소하는 것이 HotSpot Compiler의 주 설계 목적이다.

하지만 이 와중에 우리의 시각을 혼란스럽게 만드는 요소가 생긴다.
HotSpot JVM의 지능적이고 자동화된 최적화 기법 때문에 Warming-Up 시간이 필요해진다. 즉 JVM이 바이트코드로부터 최적의 바이트코드나 바이러리 코드를 찾아내는데 일정한 시간이 필요하다. 이 때문에 최초 얼마간의 성능 측정만으로 성능을 좋다 나쁘다를 결정하는 것은 위험할 수 있다.

동일한 구현 방법에 대해서 JVM이 C/C++과 거의 비슷하거나 나은 정도의 성능을 내는데 10분 정도의 시간이 소요되었다는 연구 결과를 본 적이 있다.

아래 논의도 비슷한 내용을 담고 있다. 분명이 성능이 더 좋아야 할 코드가 실제로는 나쁜 성능을 보이는 이유에 대해서 논의한 결과이다. 잠정적인 결론은 HotSpot JVM이 최적의 코드를 생산해내는데 걸리는 Warming-Up 시간으로 인해 초기에는 그런 모순된 결과가 나올 수 있다는 것이다.

섣부른 성능 테스트라는 것이 얼마나 많은 오류를 내포하고 있는지 알 수 있는 기회이기도 하다.

UMTS
Posts:3
Registered: 9/21/07
naive for loops runs faster than "optimized"?
Sep 21, 2007 7:14 AM

well,
I remember certification question (310-025):
what piece of code runs faster:
1.
for(int i=0; i
r = r + s.charAt(i);
}
2.
int l = s.length();
for(int i=0; i
r = r + s.charAt(i);
}

well anybody would say that 2 should be at least not worse than 1. We eliminate repeatable call to length() method of string.
I was really surprised that in 1.6.0 the 1 run is faster (not much but about 10%-15%).
I could understand that method inline-*** could make 1. to be of thesame performance that achieved in 2. but being FASTER than 2 is reallyextraordinary behaviour. Could explain why?

I looked at piece of code (java.util.regex.Mathcer):
/**
* Returns a literal replacement String for the specified
* String.
*
* This method produces a String that will work
* as a literal replacement s in the
* appendReplacement method of the {@link Matcher} class.
* The String produced will match the sequence of characters
* in s treated as a literal sequence. Slashes ('') and
* dollar signs ('$') will be given no special meaning.
*
* @param s The string to be literalized
* @return A literal string replacement
* @since 1.5
*/
public static String quoteReplacement(String s) {
if ((s.indexOf('
') == -1) && (s.indexOf('$') == -1))
return s;
StringBuffer sb = new StringBuffer();
for (int i=0; i
char c = s.charAt(i);
if (c == '
') {
sb.append('
'); sb.append('
');
} else if (c == '$') {
sb.append('
'); sb.append('$');
} else {
sb.append(c);
}
}
return sb.toString();
}
and think that having s.length() as a variable would make code faster,but making small experiment showed me that I might be wrong :(
Dion_Cho
Posts:40
Registered: 9/6/07
Re: naive for loops runs faster than "optimized"?
Sep 21, 2007 9:01 AM (reply 1 of 4)


Hm... weird. My code tells the opposite.

See following:

c:Program FilesJavajdk1.6.0_02bin>java -version
java version "1.6.0_02"
Java(TM) SE Runtime Environment (build 1.6.0_02-b06)
Java HotSpot(TM) Client VM (build 1.6.0_02-b06, mixed mode)

c:Program FilesJavajdk1.6.0_02bin>java -cp . test
1. Elapsed time = 1468 <-- for(int i=0; i
2. Elapsed time = 1156 <-- for(int i=0; i
3. Elapsed time = 1156 <-- for(int i=0; i
4. Elapsed time = 1156 <-- for(int i=0; i<30; i++) s.length() == 30

My test shows that naive loop is slowest and optimized code runs faster. This is natural result.

To make it clear, i recommend you review your test code. If youcan't find any flaw, better post your source code. In that case, mytest code could have flaw.

Anyway, i'm not sure that String.length method is inlined by hotspot jvm.
Hprof shows that: (-Xrunhprof:cpu=times)

rank self accum count trace method
1 50.34% 50.34% 1 303982 test.main
2 31.10% 81.45% 400000 300848 java.lang.String.charAt
3 5.65% 87.09% 102001 300847 java.lang.String.length
4 0.58% 87.67% 12 302704 sun.misc.ASCIICaseInsensitiveComparator.compare
5 0.58% 88.24% 129 302697
...

Here you see String.length method being invoked, which means it's not inlined.

(Note) For String.length inlining, refer to http://java.sun.com/developer/onlineTraining/Programming/JDCBook/perf2.html

====================================================
public class test {

public static void main(String[] args) {

int r1 = 0;
int r2 = 0;
int r3 = 0;

String s = "123456789012345678901234567890";

long startTime = System.currentTimeMillis();

for(int idx=0; idx < 10000000; idx++) {
for(int i=0; i
r1 = r1 + s.charAt(i);
}
}

System.out.printf("1. Elapsed time = %dn", System.currentTimeMillis() - startTime);

startTime = System.currentTimeMillis();

int l = s.length();
for(int idx=0; idx < 10000000; idx++) {
for(int i=0; i
r2 = r2 + s.charAt(i);
}
}

System.out.printf("2. Elapsed time = %dn", System.currentTimeMillis() - startTime);

startTime = System.currentTimeMillis();

for(int idx=0; idx < 10000000; idx++) {
l = s.length();
for(int i=0; i
r2 = r2 + s.charAt(i);
}
}

System.out.printf("3. Elapsed time = %dn", System.currentTimeMillis() - startTime);

startTime = System.currentTimeMillis();
for(int idx=0; idx < 10000000; idx++) {
for(int i=0; i<30; i++) {
r3 = r3 + s.charAt(i);
}
}

System.out.printf("4. Elapsed time = %dn", System.currentTimeMillis() - startTime);
}
}

Edited by: Dion_Cho on Sep 21, 2007 6:24 PM

UMTS
Posts:3
Registered: 9/21/07
Re: naive for loops runs faster than "optimized"?
Sep 22, 2007 12:04 AM (reply 2 of 4)

What you say is just what I would expect, but:
C:alexprototypestringbin>java -version
java version "1.6.0_02"
Java(TM) SE Runtime Environment (build 1.6.0_02-b05)
Java HotSpot(TM) Client VM (build 1.6.0_02-b05, mixed mode, sharing)

C:alexprototypestringbin>java TestA
A: value = 20467200000, elaplsed (seconds): 4.040892348

C:alexprototypestringbin>java TestB
B: value = 20467200000, elaplsed (seconds): 5.67969728

C:alexprototypestringbin>
and code:
public class TestA {

/**
* @param args
*/
public static void main(String[] args) {
String s = "";
for(int j = 0; j<10000; j++) {
s = s + j;
}
long t1 = System.nanoTime();
long r = 0;
for(int j = 0; j<10000; j++) {
for(int i = 0; i < s.length(); i++) {
r = r + s.charAt(i);
}
}
long t2 = System.nanoTime();
System.out.println("A: value = " + r + ", elaplsed (seconds): " + (t2-t1)/1000000000.0);
}

}

-- and B which supposed to be faster --
public class TestB {

/**
* @param args
*/
public static void main(String[] args) {
String s = "";
for(int j = 0; j<10000; j++) {
s = s + j;
}
int l = s.length();
long t1 = System.nanoTime();
long r = 0;
for(int j = 0; j<10000; j++) {
for(int i = 0; i < l; i++) {
r = r + s.charAt(i);
}
}
long t2 = System.nanoTime();
System.out.println("B: value = " + r + ", elaplsed (seconds): " + (t2-t1)/1000000000.0);
}

}

How would you explain that?

But look also at that, I have changed the order of tests in your code:
public class TestTest {

public static void main(String[] args) {

int r1 = 0;
int r2 = 0;
int r3 = 0;

String s = "123456789012345678901234567890";

long startTime = System.currentTimeMillis();

int l = s.length();
for (int idx = 0; idx < 10000000; idx++) {
for (int i = 0; i < l; i++) {
r2 = r2 + s.charAt(i);
}
}

System.out.printf("2. Elapsed time = %dn", System.currentTimeMillis()
- startTime);

startTime = System.currentTimeMillis();

for (int idx = 0; idx < 10000000; idx++) {
l = s.length();
for (int i = 0; i < l; i++) {
r2 = r2 + s.charAt(i);
}
}

System.out.printf("3. Elapsed time = %dn", System.currentTimeMillis()
- startTime);

startTime = System.currentTimeMillis();
for (int idx = 0; idx < 10000000; idx++) {
for (int i = 0; i < s.length(); i++) {
r1 = r1 + s.charAt(i);
}
}

System.out.printf("1. Elapsed time = %dn", System.currentTimeMillis()
- startTime);

startTime = System.currentTimeMillis();
for (int idx = 0; idx < 10000000; idx++) {
for (int i = 0; i < 30; i++) {
r3 = r3 + s.charAt(i);
}
}

System.out.printf("4. Elapsed time = %dn", System.currentTimeMillis()
- startTime);
}
}

and got:
2. Elapsed time = 4140
3. Elapsed time = 922
1. Elapsed time = 1109
4. Elapsed time = 1282

Edited by: UMTS on Sep 22, 2007 12:11 AM

Dion_Cho
Posts:40
Registered: 9/6/07
Re: naive for loops runs faster than "optimized"?
Sep 22, 2007 1:17 AM (reply 3 of 4)

Here goes my result(of your codes):
(my laptop outperforms yours :) )

c:Program FilesJavajdk1.6.0_02bin>java -cp . TestA
A: value = 20467200000, elaplsed (seconds): 2.025779812

c:Program FilesJavajdk1.6.0_02bin>java -cp . TestB
B: value = 20467200000, elaplsed (seconds): 1.90442668

To your suprising, the result is exact opposite of yours.

I think you'd better narrow down your phenomenon.
1. Remove all the codes unrelated to String.length(). Measure the effect of "String.length()" not others
2. Optionally, run profiler using "-Xrunhprof:cpu=times" and check which step is the most time consuming job
3. Optionally, collect gc log using "-XX:+PrintGCDetails -XX:+PrintGCTimeStamps"

PS) Anyway, i think my test code was flawed. Maybe because of theeffect of hotspot jvm optimization. I had seen some researches thathotspot jvm needs almost 10 minutes running time to reach its maximumperformance. It's called "warming up" time, and the exact warming uptime depends on situation.
UMTS
Posts:3
Registered: 9/21/07
Re: naive for loops runs faster than "optimized"?
Sep 24, 2007 3:48 AM (reply 4 of 4)

Seems I got a kind of clue...
I modified the code to run the same piece more times...
public class TestA {
public static void main(String[] args) {
r();
r();
r();
r();
r();
r();
}
public static void r() {
String s = "";
for(int j = 0; j<10000; j++) {
s = s + j;
}
long t1 = System.nanoTime();
long r = 0;
for(int j = 0; j<10000; j++) {
for(int i = 0; i < s.length(); i++) {
r = r + s.charAt(i);
}
}
long t2 = System.nanoTime();
System.out.println("A: value = " + r + ", elaplsed (seconds): " + (t2-t1)/1000000000.0);
}
}

public class TestB {
public static void main(String[] args) {
r();
r();
r();
r();
r();
r();
}
public static void r() {
String s = "";
for(int j = 0; j<10000; j++) {
s = s + j;
}
int l = s.length();
long t1 = System.nanoTime();
long r = 0;
for(int j = 0; j<10000; j++) {
for(int i = 0; i < l; i++) {
r = r + s.charAt(i);
}
}
long t2 = System.nanoTime();
System.out.println("B: value = " + r + ", elaplsed (seconds): " + (t2-t1)/1000000000.0);
}
}
and I got
C:alexprototypestringbin>java TestA
A: value = 20467200000, elaplsed (seconds): 4.029689249
A: value = 20467200000, elaplsed (seconds): 4.058129455
A: value = 20467200000, elaplsed (seconds): 1.603812775
A: value = 20467200000, elaplsed (seconds): 1.603092293
A: value = 20467200000, elaplsed (seconds): 1.605882033
A: value = 20467200000, elaplsed (seconds): {color:#ff0000}1.609466566

C:alexprototypestringbin>java TestB
B: value = 20467200000, elaplsed (seconds): 5.702802171
B: value = 20467200000, elaplsed (seconds): 5.5870017
B: value = 20467200000, elaplsed (seconds): 1.608010236
B: value = 20467200000, elaplsed (seconds): 1.594163784
B: value = 20467200000, elaplsed (seconds): 1.595716215
B: value = 20467200000, elaplsed (seconds): {color:#ff0000}1.595248558

C:alexprototypestringbin>

Now we see that eventually B is faster as everyone would expect...
Butby some reasons optimization of that code tooks a bit longer time thanA... which is strange but understandable (may be optimizer "tried" somemore variants of optimization)...

What strange is that your laptop faster than mine and showed expected behaviour from the very beginning.
I own Toshiba P105-6114, 1GB processor T5500 (dual core), what isyours? May be having 2 processors slows optimization down for a while?

Concerning naive versus optimezed I dare say that effectof optimization is close to 0, and seems I know why... String is afinal class more over it is a class with well-known behaviour:optimizer could asume that length() always produce the same value andthere is no reason to "invoke" it as many times as specified -- it isjust possilbe to call once and "cache" the result...
What do you think?



신고
tags : ,
Trackback 0 : Comment 0

Write a comment

티스토리 툴바