The video is available at http://go.techtarget.com/r/1950475/6108168 (registration required).
The transcript follows:
Kirk Pepperdine (Moderator): We are sitting here with John Davies from IONA and Gil Tene from Azul Systems, and we are here to talk about the Azul experience. We will start with you, Gil. Can you tell us a bit more about Azul and yourself?
Gil Tene: Azul is a young company with an exciting product out on the market for the Java enterprise environment. We make a device that powers Java Virtual Machines on existing servers: Linux, Solaris, HP-UX, and AIX, and allows Java applications to run literally 100 times as large, 100 times as fast.
Kirk Pepperdine: So is this like a board running on a slot in my Linux box; how does this actually fit together?
Gil Tene: Well, the device itself sits on the network. It is purpose-built. It is intended to power Java Virtual Machines to huge scales and eliminate a lot of problems like garbage collection pauses and scalability challenges. We provide a Java Virtual Machine for servers that sit on the network and when those servers execute that Java Virtual Machine they tap the power of the device and provide scalability for their applications.
Kirk Pepperdine: So, you are running a virtual machine on that box and how am I able to communicate with it from my Linux machine there, like making network calls or something?
Gil Tene: The virtual machine is installed just like any virtual machine would be on a normal server. Once you execute it, what is really being executed is a small-footprint proxy. That proxy locates one of the appliances over the network, places the virtual machine code on the appliance and executes it there, and the application as a whole runs within the Azul appliance. The application running within the appliance uses the proxy to communicate with the world around it whenever there is any I/O: file I/O, network packets being thrown around, or system calls. So the application residing within the virtual machine, be it a custom Java app or WebLogic, JBoss, Tomcat, GigaSpaces, Tangosol, any of those containers, would basically believe it is still running on the server that invoked the Java Virtual Machine, and so would everything around that server. The server would just appear to be a lot more powerful, the application would scale much more smoothly on it, and that is really where the experience with Azul starts.
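A very rough sketch of the proxy idea, purely illustrative since Azul's discovery protocol and classes are not public in this form (the host name, port, and class here are all hypothetical): the local process does no computation itself, it only relays I/O between the remote VM and the local machine.

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;

// Illustrative stand-in for the thin local proxy: the computation
// happens on the appliance; the proxy just pumps I/O so the
// application still appears to run on this server.
public class ProxySketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical appliance address; real discovery is more involved.
        try (Socket appliance = new Socket("azul-appliance.example.com", 9999)) {
            // Relay appliance output to local stdout. A real proxy would
            // also forward stdin, files, network connections and system
            // calls in both directions.
            pump(appliance.getInputStream(), System.out);
        }
    }

    private static void pump(InputStream in, OutputStream out) throws Exception {
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
    }
}
```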
Kirk Pepperdine: So, John, you are at the opposite end of the spectrum here -- the consumer of, or at least helping the consumers of this hardware utilize it. What type of problems do you see this hardware solving? What does the customer look like?
John Davies: Speaking from a banking perspective, in investment banks we have some seriously high volumes: we are processing massive amounts of data with low-millisecond, sometimes even sub-millisecond, reaction times. It is scalable if you use the right technologies; you can scale it out into the network using Tangosol, GigaSpaces, GemStone, etc. You get good scalability, but you take a massive performance hit as you hit the network. If you can keep everything inside the same virtual machine, you do not suffer that hit. The problem is that you then hit garbage collection problems, and you hit scalability limits quite physically, because there is a virtual memory size you just cannot get past. You can use dual-core, quad-core machines and get 16, 32 GB of RAM these days, so you can fit a lot into those and get a lot of performance, but you still hit a barrier. When you have to go past that barrier, you choose to distribute, and then you need a lot more processing power to regain what you lost in the distribution. Using something like an Azul box, you can just go 10, 20, 30 times bigger and get heaps of, what, hundreds of gigabytes.
Gil Tene: 300 GBs.
John Davies: Yeah, several hundred GBs, and rather than multi-core in the 2, 4, 12 range, you are into the tens and hundreds of cores on hundreds of gigabytes. It gets quite impressive what you can do on these things.
Kirk Pepperdine: So, what is the biggest machine you have actually deployed then?
Gil Tene: The biggest one we have in deployment today runs a 320 GB heap on an application that saturates more than 700 cores. It is one machine, and one VM runs in that entire space. That same machine could also be used to consolidate 200 different applications, each needing as little as 3 GBs of memory, and, put together, several of these machines could address an entire data center's needs.
Kirk Pepperdine: So, it must be very difficult for people who write applications to actually utilize all of these cores and all that memory to begin with. Do you see that as a problem, John?
John Davies: It was amusing. When we tested C24 IO (Integration Objects), which is now of course IONA's data services, we put it onto the Azul boxes, and of course you have to write it multithreaded. We had assumed you would never have more than probably 16 cores in a machine, so we started with 16 threads and worked our way backwards to try to calculate optimal performance, for things like the JIT compiler. Of course, when we put it on these guys' boxes, we had to start thinking completely differently, putting in something like 200 threads. We then had a few little problems, mostly with third-party libraries simply not scaling, but the hardware, which is pretty impressive, has debugging support that allowed us to sort these problems out fairly quickly. You have really got to scale things out on the multithreaded side.
Gil Tene: I think the tools we have are actually one of the things that really help people achieve scale fairly quickly. When you place an application on Azul, you will normally reach some natural limit of scaling. Our general experience is that it comes in the tens of cores, usually 40 or 50. I do not know, when did you hit your first actual bottleneck that you had to go and look for?
John Davies: The first one we noticed very quickly was in a third-party library which, in fairness, I will not mention. It hit pretty quickly, and it did not scale at all beyond that point. Because it was third-party, we sent the debugging output off to the library's owner, and it was fixed. We then hit some stranger problems, as you say, around 60, 70 cores; the performance was totally linear until we got there, then we had some strange behavior, then it picked up again. Fixing the problems and changing some of the libraries as we went, we got incredible linearity of performance.
Kirk Pepperdine: And what was the total time, going from the beginning until you got true linear performance…
John Davies: In elapsed time it was days. In actual time put into it, as usual you do these things in bits and pieces; I do not know, it was hours really…
Gil Tene: That is usually what we see: there will be some bottlenecks to run into, and that is why we have very good tools to analyze what is going on in the VM and identify the scaling bottlenecks, so that you can easily push past them. Our experience is usually that within a day or two of installing the appliance you get some level of scalability, analyze it with the tools to figure out what to tune or tweak, sometimes change some things in code, and then we get that application to scream on the device.
Kirk Pepperdine: Yeah. So I guess people will be testing their applications in traditional environments, and when they move onto this hardware they are bound to find a lot of bottlenecks in their applications. What are the typical bottlenecks people run into?
Gil Tene: I would classify those in two parts. There are the bottlenecks in applications that exist before we show up; those usually have to do with not being able to scale the heap past a gigabyte or two without experiencing significant pauses, or not being able to keep more than a handful of CPUs busy without pushing the memory to a point where significant thrashing or pausing occurs. When we show up, the first thing we do is use a larger heap, or create a pauseless heap situation, so that the application can smoothly scale past that. The next level we often see is that the application needs some tuning. It is typically tuned, as you said, to aim at machines with a handful of cores (16 cores seems to be considered large), so the thread pools are sized small and the connection pools are sized small. Usually these are configurable in a lot of application platforms, but sometimes you need changes in code if it is an embedded configuration.
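A minimal sketch of the kind of tuning Gil describes, using plain java.util.concurrent (the class name is illustrative): sizing pools from the machine's actual core count instead of a hard-coded guess lets the same code use a handful of cores or several hundred.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PoolSizing {
    public static void main(String[] args) {
        // A hard-coded pool (say, 16 threads) leaves most of a
        // several-hundred-core machine idle. Sizing from the runtime
        // lets the same binary use whatever cores are present.
        int cores = Runtime.getRuntime().availableProcessors();
        ExecutorService workers = Executors.newFixedThreadPool(cores);
        System.out.println("Sized worker pool to " + cores + " threads");
        workers.shutdown();
    }
}
```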
Kirk Pepperdine: Yeah, well, the thread pool is often used more as a throttle, so that you do not swamp the hardware. So this is a situation where you just have to open up the gates.
Gil Tene: That is the first phase, and usually it happens fairly quickly. The next level we run into is usually contention for resources: things like queues that bottleneck, where striping the queues or creating larger work units on the queues helps alleviate the contention. Sometimes you find people contending for external resources like files, and blocking around logging; tweaking or tuning the use of logging is another thing we often see as an early exercise. But the experience John had, and many of our customers have, is that within days of using the tools to analyze, you see where the bottlenecks are and push forward. Usually you get past them into some very significant scalability. An interesting side effect we have heard from a lot of people is that the application starts running better even on the native platform, because we found the bottlenecks there. We have had developers say they would like this in development and QA just to find normal bottlenecks in normal systems, because it finds them faster and the analysis tools are great.
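A hedged sketch of the queue-striping idea mentioned here (the structure and names are illustrative, not Azul's implementation): instead of every thread hammering one shared queue, each thread hashes to one of several independent queues, spreading the contention.

```java
import java.util.concurrent.ConcurrentLinkedQueue;

// Illustrative striped queue: N independent queues instead of one,
// so threads mostly contend on different stripes.
public class StripedQueue<T> {
    private final ConcurrentLinkedQueue<T>[] stripes;

    @SuppressWarnings("unchecked")
    public StripedQueue(int nStripes) {
        stripes = new ConcurrentLinkedQueue[nStripes];
        for (int i = 0; i < nStripes; i++) {
            stripes[i] = new ConcurrentLinkedQueue<>();
        }
    }

    // Spread threads across stripes by thread identity.
    private int homeStripe() {
        return (int) (Thread.currentThread().getId() % stripes.length);
    }

    public void offer(T item) {
        stripes[homeStripe()].offer(item);
    }

    public T poll() {
        // Start at this thread's stripe, then scan the others.
        int start = homeStripe();
        for (int i = 0; i < stripes.length; i++) {
            T item = stripes[(start + i) % stripes.length].poll();
            if (item != null) return item;
        }
        return null;
    }
}
```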
Kirk Pepperdine: There are a couple of points you touched on, and a phrase that slipped by and tweaked my ear: pauseless heap, a pauseless heap situation. Can you explain a little bit more about that and the garbage collection side?
Gil Tene: One of the hardware features, one of the core things in the design of the entire appliance, deals with allowing Java Virtual Machines to scale in memory and in CPUs without garbage collection being a limitation. We believe it is actually the number one limitation to scaling almost all Java Virtual Machines; even on machines where you can physically get 16 and 32 GB of memory, you are still unable to use more than a couple of those gigabytes in one virtual machine because of this limitation. Our hardware is able to do a fundamental thing, which is detect attempts to use references to relocated objects. With that hardware assist we are able to build a fairly simple garbage collector that can relocate objects and compact the heap on the fly without stopping the application. That is a feature unique to this virtual machine; it is one of the…
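Conceptually, a read barrier works something like the following illustrative model; on Azul the check is a hardware assist on reference loads, so none of this appears in application code, and the forwarding scheme here is hypothetical.

```java
// Illustrative software model of a GC read barrier. On Azul this check
// happens in hardware on every reference load; it is spelled out here
// only to show the idea.
final class Ref {
    volatile Object target;      // where the reference currently points
    volatile Object forwardedTo; // set by the collector if it moved the object

    Object load() {
        Object seen = target;
        Object moved = forwardedTo;
        if (moved != null) {
            // The collector relocated the object while the application
            // kept running: repair the stale reference and continue,
            // with no stop-the-world pause required.
            target = moved;
            return moved;
        }
        return seen;
    }
}
```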
Kirk Pepperdine: That’s a very powerful feature too, isn’t it?
Gil Tene: Yep. We do not see it in any other commercially shipping enterprise-class VM, and it is really what allows you to scale.
Kirk Pepperdine: I would think the banks would be very interested, because GC pause time is very disruptive.
John Davies: It is the main reason why there are still diehards that stick with C/C++: you can guarantee the performance. Not necessarily faster, but the performance is guaranteed. If we run something in Java, it will quite frequently run faster than it would have done in C/C++, but every now and then it will just pause, and that pause can be extremely expensive. When you go to 8, 16, 32 GBs of RAM, that pause can run into minutes. Everything times out, thinks the application has died, and things start recycling; it is quite painful, and sometimes it happens as often as daily. We end up rebooting servers every night.
Kirk Pepperdine: So what happens if you let it go for 24 hours or something like that? Often you hear that a super-long GC pause will…
John Davies: Yeah, big time. It really just locks it up. If you have seen a graphics application in Java, it occasionally just pauses and nothing happens; you get no response from the JVM at all. That is small on a half-gigabyte heap, but when you move into 2, 4 gigabytes and beyond, it is very, very painful.
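The pause behavior John describes is easy to observe with a small, hedged sketch like the following: a thread that sleeps briefly in a loop and reports any unexpectedly large gap, which on a stop-the-world collector shows up whenever GC stalls the JVM (the 100 ms threshold is arbitrary).

```java
// Minimal pause detector: if the thread sleeps for 1 ms but observes a
// much larger gap, something (typically a stop-the-world GC) stalled it.
public class PauseWatcher {
    public static void main(String[] args) throws InterruptedException {
        long last = System.nanoTime();
        while (true) {
            Thread.sleep(1);
            long now = System.nanoTime();
            long gapMs = (now - last) / 1_000_000;
            if (gapMs > 100) {
                System.out.println("Stalled for ~" + gapMs + " ms");
            }
            last = now;
        }
    }
}
```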
Gil Tene: That is exactly the behavior that I think is the number one attractive thing about Azul appliances. As I said before, Java applications tend to work smoothly within the one-gigabyte envelope; going beyond that, you start having these unexplained or mysterious pauses or freezes of the application. Those are basically garbage collection doing its job on a rare basis, and it is exactly that behavior that we eliminate. One of the key reasons people use Azul is to get consistent behavior and smooth execution, without the fear of some worst-case behavior happening in a rare and untested situation.
Kirk Pepperdine: There are some other keywords in that explanation that stuck out, like hardware assist. So Azul is not actually running the JVM in hardware; it is running a Java Virtual Machine specially prepared for the Azul CPU type, yeah?
Gil Tene: Yes.
Kirk Pepperdine: And there is this thing called hardware assist, can you explain like a little bit more about what that is?
Gil Tene: So, at the core of an Azul appliance are chips that Azul makes, very impressive chips with 48 processor cores each, and 16 of those chips gang up to create a 768-way SMP. The individual processors were designed to run virtual machines. They are by no means Java execution engines, they are not running bytecodes or anything like that, but they were very much designed to make an optimal JIT compiler target. With that in mind, we have created several assists in the instruction set that help specific patterns we see in virtual machines, Java and other virtual machines like .NET which share those patterns. Garbage collection read and write barriers are one example: you need to place them around reference operations in Java programs if you want any sort of effective, efficient and pauseless collection. In addition to that, we have hardware assists, if you will, around detection of interesting behaviors, like escape analysis and non-atomicity detection. Our hardware can literally detect when an optimistic concurrency attempt in thread execution is wrong, and allows the virtual machine to then roll back state. That theory…
Kirk Pepperdine: Maybe you can explain this a little more deeply. What you are saying is that if there are conflicting mutations occurring, you are just going to detect that and roll back the state of those threads?
Gil Tene: Yes. This is a combination of hardware and software, but there is a critical piece of hardware that allows it to happen. In industry and in academia this is typically termed transactional memory. Our hardware is, I believe, the first commercial hardware to have this capability, and it really allows us to take a Java synchronized block or synchronized method and execute it optimistically along with other threads that are executing the exact same method, the exact same block, under the same lock. The optimistic execution is possible because the hardware can detect when the optimism is wrong. That detection, which is very hard or near impossible to do in software, allows us to take a synchronized block that is optimistically executing and cannot complete its execution atomically, revert the entire state to the beginning, and, completely transparently to the program, have the virtual machine simply make its actions go away and start them over. That capability translates to us being able to keep a lot more cores busy with the same sort of program, because we can run a lot more threads at the same time.
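Standard Java cannot express hardware transactional memory, but the execute-optimistically, validate, roll-back-and-retry shape Gil describes can be sketched in software with java.util.concurrent.locks.StampedLock; this is only an analogy, not Azul's mechanism.

```java
import java.util.concurrent.locks.StampedLock;

// Software analogy of optimistic execution: read without locking,
// validate afterwards, and fall back if another thread interfered,
// the same detect-and-roll-back shape Azul's hardware provides
// for synchronized blocks.
public class OptimisticPoint {
    private final StampedLock lock = new StampedLock();
    private double x, y;

    public void move(double dx, double dy) {
        long stamp = lock.writeLock();
        try {
            x += dx;
            y += dy;
        } finally {
            lock.unlockWrite(stamp);
        }
    }

    public double distanceFromOrigin() {
        long stamp = lock.tryOptimisticRead(); // optimistic, no blocking
        double curX = x, curY = y;
        if (!lock.validate(stamp)) {
            // Optimism was wrong: a writer interfered. Discard what we
            // read and redo the read pessimistically.
            stamp = lock.readLock();
            try {
                curX = x;
                curY = y;
            } finally {
                lock.unlockRead(stamp);
            }
        }
        return Math.sqrt(curX * curX + curY * curY);
    }
}
```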
Kirk Pepperdine: How does this work when you have hot locks? Do you find that hot locks can still be a problem, or do you actually see reductions in contention compared to hardware that does not have this type of assist?
Gil Tene: It is actually, I think, a yes and yes. Hot locks are always a problem, but on Azul hot locks are a problem when they contend on data. When there is actual contention, there was a good reason for the lock to exist in the common case. Hot locks around queues, or around small data structures that are heavily written to, will contend just as on any other platform, but hot locks around non-contended data are things that we basically optimize away in execution. What happens in effect is that we experience fewer lock contentions on the same program, because we can optimistically execute contended locks rather than wait and serialize on them.
Kirk Pepperdine: John, do you find your customers are actually noticing that lock contention is somewhat less of an issue in their applications?
John Davies: We are certainly seeing people notice the improvements that have been made from having systems with hardware debugging, if you like. With a lot of this stuff, as you well know, Kirk, if you are trying to debug it, one of the symptoms is that as soon as you put the debugger on, the problem disappears, because you are slowing everything down. In intense performance situations we have locks on memory and something going wrong, but the fact that the hardware can actually pinpoint this and give us reports of where the contentions are means we can improve them. We can make changes to the software. It improves performance not just on the Azul box, but outside, on standard hardware, as well.
Kirk Pepperdine: What other comments are you getting from customers in general when they start seeing their applications run on this hardware?
John Davies: A big smile, obviously. Like a lot of technology these days, you sometimes advance in leaps and bounds, and this really is a leap. It is one thing to imagine what several hundred cores does to an application, and another to see how fast it actually goes. Having that amount of RAM available in a Java Virtual Machine brings people to re-engineer and re-architect problems which they previously either distributed or, because they could not fit them into memory, persisted to a database or some form of disk. Because they no longer have to hit disk, you get orders-of-magnitude performance gains from that alone. We can take an entire day's equity trades from a very large bank, put them into a single virtual machine and process them in one go.
If we can do that, it means we do not have the same problems with locking or the same problems with systems, and if a run fails we can rerun it in a fraction of the time it would take if we were going out to disk.
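A hedged sketch of this in-memory pattern (Trade and its fields are hypothetical stand-ins, not a real bank's model): hold the day's trades in the heap and brute-force them across every core the VM exposes, with no database or disk round trips.

```java
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class InMemoryTrades {
    // Hypothetical trade record; a real feed would define this.
    record Trade(String symbol, long quantity, double price) {}

    public static void main(String[] args) {
        // Stand-in for a day's equity trades loaded into the heap.
        List<Trade> trades = IntStream.range(0, 1_000_000)
                .mapToObj(i -> new Trade("SYM" + (i % 500),
                        ThreadLocalRandom.current().nextLong(1, 1_000),
                        ThreadLocalRandom.current().nextDouble(1, 100)))
                .collect(Collectors.toList());

        // Brute-force scan across all available cores: no database,
        // no disk, just memory bandwidth and CPUs.
        double notional = trades.parallelStream()
                .mapToDouble(t -> t.quantity() * t.price())
                .sum();
        System.out.printf("Total notional: %.2f%n", notional);
    }
}
```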
Kirk Pepperdine: So, you are just releasing a new version of the hardware now, right? Is it out and released, or about to be released?
Gil Tene: We just released that two weeks ago.
Kirk Pepperdine: Two weeks ago?
Gil Tene: It is really our largest system to date: a 768-way, 768 GB machine in a new form factor. These machines are the largest scale of the Vega 2 line that we released late last year, really the largest machines out there, and…
Kirk Pepperdine: So what are the improvements over the older version of the hardware?
Gil Tene: Well, we are actually very proud of the rate of improvement in our hardware. Our Vega 2 appliances have three times the capacity of the Vega 1 appliances, which themselves were pretty big and pretty good. The rate at which we have been able to improve over a fairly short period outstrips Moore's Law, and we intend to keep executing very well into the future. When you compare these to existing platforms, the machines can be seen either as tremendous consolidation machines, allowing you to displace tens and hundreds of servers with a handful of ours, or, on the other hand, as John pointed out, as a way to start engineering new applications. We are seeing a lot of action around applications that would previously be deemed impractical or impossible: holding an entire day's trades in memory, holding an entire bank's positions in memory and doing risk analysis on them, doing analysis on events at rates beyond what people would typically do in memory. Literally holding 200 and 300 GB data sets and analyzing them brings new capabilities to mind, literally 100X improvements that come mostly out of eliminating process-to-process or process-to-file I/O kinds of work.
Kirk Pepperdine: John, what are the typical types of applications here, the new problems that your customers feel they can now solve? Can you tell me about that?
John Davies: The requirements are far exceeding Moore's Law. We are getting more and more requirements; the data volumes are just getting absurd. There is competition between the banks: we are seeing matching and reconciliation workloads with unbelievable figures, and a bank only makes money if it can reconcile those, or come back with the figures, quicker than the other banks. It is that striving for the best figures. You can go out and do that in a database, and banks can compete on the database side, but if you can do it in memory and do it a thousand times faster, it is a…
Kirk Pepperdine: So, you really have the rare requirement that this thing has to run as fast as it possibly can.
John Davies: Absolutely. I mean the bank that gets it done fastest is the bank that makes the money.
Kirk Pepperdine: Yeah. This is very unlike other business applications, where you can generally set fixed performance requirements.
John Davies: Yeah, it is really this striving competition that gets them into it. They have been getting to bigger and bigger heap sizes by using very interesting in-memory technologies which are distributed, but if you can put all of that inside a single VM, it is very impressive.
Gil Tene: I think there are two areas. One is obviously the competitive banking environment. The other one we see often is businesses wanting to bring batch execution into what they call real time, so they can take business actions during the day, or almost immediately, based on analysis that would otherwise have taken overnight or a week to process. The ability to shift analysis from running overnight in batch to taking five minutes, for example, on huge data sets can completely change what a business can do, and not just in the banking industry but in general industry as well.
John Davies: Well, rather than going into a large database, maybe half a terabyte, which is not unusual these days, trawling through it and spending hours and hours poring over it, maintaining different isolation levels and so on, it is quicker to load it all up into memory, into one big virtual machine, and trawl through it, work through it by brute force.
Kirk Pepperdine: So you are able to essentially eliminate the database from the solution?
Gil Tene: Yeah, the database; there is no need for it. You just store the transactions you have completed, the ones you need to keep as an audit trail, but other than that it is of no great use. We can brute-force half a terabyte of data far quicker than software databases can.
John Davies: We are seeing patterns where people in retail or high-end web applications do the same thing. One of our customers is using Azul for handling the back end of hotel reservations across probably half the hotels in the US, which is an in-memory problem for them, and therefore they are able to handle it at the rate required, where doing the same against a database would not keep up.
Kirk Pepperdine: So it is not just the banking industry that is interested in this technology. Are there many other industries that you think are benefiting from it right now?
Gil Tene: The banking industry is clearly one of the early adopters. We also have some significant footprints in telecoms. British Telecom is one of our publicly announced customers, using us in a B-to-B gateway of significant size.
John Davies: An interesting gambling one as well.
Gil Tene: Yes.
John Davies: Online gambling.
Gil Tene: Online gambling. Online gambling looks a lot like banking.
Kirk Pepperdine: They still have the hard real-time requirements.
John Davies: Yeah, absolutely. In a sense it is the same thing: it is bidding, trading in real time.
Gil Tene: Online retail, the other end of financial institutions like insurance and…
Kirk Pepperdine: So there really is broad industry acceptance for this particular technology then?
Gil Tene: I believe so. In fact, I think it is following the same kind of adoption trend that J2EE, and Java in the enterprise as a whole, have shown, where it becomes the default deployment model for new applications in large businesses everywhere. We see interest following the same kinds of trends.
Kirk Pepperdine: So, are there any special programs that you guys are offering now?
Gil Tene: Yes. Actually, we have a program we have just created where Azul will literally guarantee an improvement in the performance and scalability of applications on servers for certain qualified applications. It is our 5-50 Performance Guarantee program: we guarantee a 5X improvement in throughput, in server utilization and in the productivity of servers, and as much as a 50X improvement in data caching capability.
Kirk Pepperdine: And it says you are 50 times simpler too?
Gil Tene: Yep. Well, simplicity, as you mentioned in the earlier talk today, John, is actually a keyword here. There are other ways to try to solve similar problems that involve a lot of complexity, distribution and re-architecture, but it is the ability to literally eliminate the problem, create a simple, large solution, cluster it for availability, and simply get rid of the problem, that we believe wins for us. It is the simplicity and the speed at which this can be achieved that is important. Usually our customers will see the benefit within two or three weeks of the initial introduction of the hardware, which is a very rare thing for a new platform. It really comes down to the deployment model, the way we inject our VMs into existing servers and existing applications, and the fact that we do not require re-architecture or any significant change to the software architecture.
Kirk Pepperdine: It is Java at the end of the day.
Gil Tene: Exactly.
John Davies: Well transparent anyway.
Kirk Pepperdine: It is Java at the end of the day. Good. Thank you very much.
Gil Tene: Thank you.
John Davies: Thank you.
This article comes from the ChinaUnix blog; the original is at http://blog.chinaunix.net/u1/45382/showart_356477.html