self-hosting and extensibility

Posted by: AA on 2005-9-08, 22:41:25:

Last time I talked about whether or not to have header files. I guess I caught only a tiny part of the issue; it's not a small topic anyway. This time I want to continue along that line, but from a different point of view.

Many people have heard of the idea of self-hosting, that is, hosting a system with the system itself, e.g., writing a C compiler in C, or building a game that can play games. The idea looks weird at first glance, or even impossible, since presumably a man could never fully understand how he understands things. This is philosophy, but in my understanding it was proved by Gödel years ago. I am not going to talk about that; it's too distant from the reality of our lives.

My interest is in building a system that is extensible. For example, you have an OS that runs some applications; these applications then act as part of the OS, either replacing some existing component of the OS or just extending its functionality. Extensibility is desirable for everyone. One obvious advantage is that you can upgrade your system live, without reinstallation.

This is actually not difficult. People have been doing it for years, for example in LISP. In a LISP console, you type in or load your program. Your program then becomes part of the LISP runtime system, and you can write new code based on it. That means it's very easy to have an extensible LISP system. The extended system is hosted by the prior one. So if your new program is itself a LISP runtime system, this new system is hosted by the old LISP system. That is, it is self-hosted. You can keep the base LISP system very small, with only the basic rules of lambda calculus and an interpreter to execute them. Of course, the very base system is not necessarily written in LISP, assuming no LISP existed before your system.
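To make the LISP point concrete, here is a minimal sketch in Python of a tiny LISP-like evaluator. It is not a real LISP; the names `tokenize`, `parse`, `evaluate`, `run` and the two-primitive base environment are all my own assumptions for illustration. What it shows is that every `define` typed into the running system becomes part of the system that later code builds on.

```python
import operator

def tokenize(text):
    # Pad parentheses with spaces so a plain split() yields the tokens.
    return text.replace("(", " ( ").replace(")", " ) ").split()

def parse(tokens):
    # Turn the token list into nested Python lists (consumes tokens).
    tok = tokens.pop(0)
    if tok == "(":
        items = []
        while tokens[0] != ")":
            items.append(parse(tokens))
        tokens.pop(0)  # drop the closing ")"
        return items
    try:
        return int(tok)
    except ValueError:
        return tok  # anything that is not an integer is a symbol

# The "base system": just two primitives (an assumption for this sketch).
env = {"+": operator.add, "*": operator.mul}

def evaluate(expr, env):
    if isinstance(expr, str):   # symbol: look it up in the running system
        return env[expr]
    if isinstance(expr, int):   # number: evaluates to itself
        return expr
    head = expr[0]
    if head == "define":        # extend the running system in place
        _, name, body = expr
        env[name] = evaluate(body, env)
        return name
    if head == "lambda":
        _, params, body = expr
        # The environment is merged at call time, so a function also sees
        # definitions that were added after it was created.
        return lambda *args: evaluate(body, {**env, **dict(zip(params, args))})
    fn = evaluate(head, env)
    return fn(*[evaluate(arg, env) for arg in expr[1:]])

def run(text):
    return evaluate(parse(tokenize(text)), env)

# Each definition becomes part of the system the next one is written on.
run("(define square (lambda (x) (* x x)))")
run("(define sum-of-squares (lambda (a b) (+ (square a) (square b))))")
result = run("(sum-of-squares 3 4)")
```

After the two `define` calls, `square` and `sum-of-squares` sit in the same table as `+` and `*`, so the evaluator cannot tell "system" code from "user" code. That is the property that makes extending it so cheap.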

A JVM (Java virtual machine) could also be built in Java, so that it is just as extensible. But this is much more difficult to achieve than with LISP. Just imagine you load a Java application into your JVM, say Sun JDK 1.5: it can't become part of the JDK. At most it can be used as a library by other Java applications; it cannot be used as part of the JVM. For example, if you write a new memory management module (a garbage collector, in Java terminology) in Java, there is no way to make the Sun JDK use it for memory management inside the JVM. Enabling this kind of extensibility requires a new design. Fortunately it's not that hard in Java, because Java has full support for reflection. Reflection is the ability of an entity (a Java class) to provide information describing itself. With this support, the JVM can query your new Java application about its properties, e.g., what's your name? how many fields do you have? how can I call your functions? So it is still possible to build an extensible, self-hosting JVM.
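The three questions above (name, fields, how to call functions) can be sketched as follows. Note that this uses Python's `inspect` module as a stand-in for Java reflection, not the Java API itself; in Java the same queries would go through `java.lang.reflect`. `MarkSweepCollector`, `Host`, and `describe` are hypothetical names made up for this example, and the "collector" does no real memory management.

```python
import inspect

class MarkSweepCollector:
    """Hypothetical plug-in standing in for a user-written garbage collector."""
    def __init__(self):
        self.heap_limit = 1024
        self.collections = 0

    def collect(self, live_objects):
        # Placeholder logic: count a collection and report the survivors.
        self.collections += 1
        return len(live_objects)

def describe(obj):
    """Ask an object about itself, the way a host runtime could."""
    methods = {
        name: str(inspect.signature(member))
        for name, member in inspect.getmembers(obj, inspect.ismethod)
        if not name.startswith("_")
    }
    return {
        "name": type(obj).__name__,    # what's your name?
        "fields": sorted(vars(obj)),   # which fields do you have?
        "methods": methods,            # how can I call your functions?
    }

class Host:
    """Hypothetical runtime that accepts any plug-in exposing collect()."""
    def __init__(self):
        self.collector = None

    def install(self, plugin):
        info = describe(plugin)
        if "collect" not in info["methods"]:
            raise TypeError(f"{info['name']} has no collect() method")
        self.collector = plugin

    def run_gc(self, live_objects):
        # Look the method up by name at run time rather than at compile time.
        return getattr(self.collector, "collect")(live_objects)

host = Host()
plugin = MarkSweepCollector()
info = describe(plugin)
host.install(plugin)
survivors = host.run_gc(["a", "b", "c"])
```

The host never names `MarkSweepCollector` in its own code; it only asks the object what it offers. That is the new design a JVM would need before it could adopt a collector loaded as an ordinary application.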

Then how about UNIX? Is it possible to have an extensible Unix? My answer is no, because it is implemented in C. C doesn't provide reflection at any level. Yes, you can build C libraries to support reflection, but that's lame from the beginning. To get around this limitation of the language, we can build an object system in C that supports reflection at the object level. In fact, people did this at a level higher than objects and called it a component: CORBA and COM have it. You can query a component about its properties just as you do with a Java object. With components, you can integrate new applications into an existing system easily.
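When the language gives you no reflection, the description has to be carried as explicit data. Below is a minimal sketch of that idea, written in Python for brevity but deliberately avoiding Python's own introspection: each component keeps a hand-built table of the interfaces it provides, loosely in the spirit of COM's `QueryInterface`. `Component`, `System`, `provide`, `query`, and the `Logger` interface are hypothetical names, not part of CORBA or COM.

```python
class Component:
    """Hypothetical component: its self-description is data it carries."""
    def __init__(self, name):
        self.name = name
        self._interfaces = {}  # interface name -> {operation name -> callable}

    def provide(self, interface, operations):
        # Register an operation table under an interface name.
        self._interfaces[interface] = dict(operations)

    def interfaces(self):
        return sorted(self._interfaces)

    def query(self, interface):
        """Return the operation table, or None if the interface is unsupported."""
        return self._interfaces.get(interface)

class System:
    """Hypothetical host that integrates components by querying them."""
    def __init__(self):
        self.components = {}

    def register(self, component):
        self.components[component.name] = component

    def find(self, interface):
        # Ask every registered component whether it supports the interface.
        return [c.name for c in self.components.values()
                if c.query(interface) is not None]

logger = Component("file-logger")
logger.provide("Logger", {"log": lambda msg: f"[file] {msg}"})

system = System()
system.register(logger)
providers = system.find("Logger")
line = system.components[providers[0]].query("Logger")["log"]("booted")
```

In C the same table would be a struct of names and function pointers. The point is only that the description is built by hand, per component, instead of coming for free from the language.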

Well, let's move on and have a look at web services. What web services do is actually the same thing, but at the service level, so that a new web service on the Internet can be easily integrated into an existing system.

OK, the discussion of self-hosting and extensibility is almost done. This is a kind of mental exercise. Can it help with my real work? Yes, I did find it helpful. It led me to think about why extensibility gets harder going from LISP to Java to C to the web. I will try to explain this phenomenon in line with my previous discussion of header files.

LISP is the easiest one simply because it has no interface issue. The program is interpreted, which means that the text of the program, the code/data, describes everything about itself. It reflects at the code level, at the level of every token. Java, on the other hand, reflects at the object level. Then CORBA reflects at the component level. Finally, web services reflect at the service level. C, sucky as it is, can't tell you anything from the code itself. Header files help here by providing a description, but only at compile time.

If you look carefully at the order of reflection levels in the techniques I discussed above, you may find that execution efficiency runs in the reverse order. The lower the level at which reflection is supported, the lower the performance. This is understandable, because runtime reflection support means runtime overhead. Still, the low performance is not inherent to these techniques, because a runtime optimizer can play its tricks here.

At the same time, the reflection levels can be mixed within one system. For example, a Java application can use the component technique to talk with another Java application through RMI, and the resulting combined application can talk with other sites through web services. In this way, it is possible to build an adaptable Internet for human beings. There is still a long way to go, since neither Java, nor RMI, nor web services were designed for this goal. Or is it only a daydream?

All replies:

well said - keyboard (49 bytes) 2005-9-11, 20:55:24
- People did that. Well that didn't make much sense to have it - AA (301 bytes) 2005-9-12, 08:05:28
  - Why is that? - keyboard (140 bytes) 2005-9-12, 21:36:53
    - It is always Performance/Money. - wangle (97 bytes) 2005-9-21, 23:00:51
    - I guess the performance is ok now with your approach 1 - AA (2413 bytes) 2005-9-14, 01:20:22
