
BLU Discuss list archive



Handouts (Oct 30th GNHLUG meeting): Part 2



From: Jon "Maddog" Hall <hall at zk3.dec.com>
-------------
In steady state, script interpreters are going to have to be dynamically linked
to the browser.  The interaction will go something like this:

[ popup window:
        YOU HAVE ACCESSED A PAGE CONTAINING VBscript.
        YOU DO NOT HAVE the VBscript INTERPRETER.
        SHALL I DOWNLOAD A FREE VBscript INTERPRETER FROM Microsoft?

           <yes> <no>
]



<html>
<head>
<h3> JavaScript Calculator </h3>
<script language="JavaScript">
// FILE:     js-calculator.html
// PURPOSE:  demonstration of JavaScript (as contrasted with Java)
// MODS:     mckeeman at zko.dec.com -- 96.01.10 -- original

function typein(form, d) {
    form.acc.value += d
}

function compute(form) {
    form.acc.value = eval(form.acc.value)
}

function clear(form) {
    form.acc.value = ""
}
pi=Math.PI
</script>
</head>

<body>
<form name=calculator>
This is an example of a page with an embedded JavaScript.  If your
browser does not support JavaScript, all you can do is view source
to see what it would do.<br>
<script language="JavaScript">
<!-- hide this from browsers that do not have JavaScript
document.write("Since you <b>do</b> have <em>JavaScript inside</em>, this page is live.");
// -->
</script>
<p>

<input type="button" value=" 1 "  onClick="typein(this.form, '1')">
<input type="button" value=" 2 "  onClick="typein(this.form, '2')">
<input type="button" value=" 3 "  onClick="typein(this.form, '3')">
<input type="button" value=" +  "  onClick="typein(this.form, '+')">
<b>Directions:</b> press keys with the mouse
<br>
<input type="button" value=" 4 "  onClick="typein(this.form, '4')">
<input type="button" value=" 5 "  onClick="typein(this.form, '5')">
<input type="button" value=" 6 "  onClick="typein(this.form, '6')">
<input type="button" value=" -  " onClick="typein(this.form, '-')">
or edit the expression in the accumulator.
<br>
<input type="button" value=" 7 "  onClick="typein(this.form, '7')">
<input type="button" value=" 8 "  onClick="typein(this.form, '8')">
<input type="button" value=" 9 "  onClick="typein(this.form, '9')">
<input type="button" value=" x  "  onClick="typein(this.form, '*')">
<br>
<input type="button" value=" 0 "  onClick="typein(this.form, '0')">
<input type="button" value=" ( " onClick="typein(this.form, '(')">
<input type="button" value="  ) " onClick="typein(this.form, ')')">
<input type="button" value="  /  " onClick="typein(this.form, '/')">
The "Clear" key will reset the calculator.
<br>
<input type="button" value=" Clear " onClick="clear(this.form)">
<input type="button" value="  . "        onClick="typein(this.form, '.')">
<input type="button" value=" =  "        onClick="compute(this.form)">
The "=" key will evaluate the input.
<p>

accumulator<br>
<input type="text" name="acc" size=50><br>
<p>
</form>
<a href="/pub/Signatures/bill-mckeeman.html">mckeeman at zko.dec.com</a>
</body>
</html>




                               javac
                  ----------------------------------

A Java class is compiled into Java byte codes.  If file X.java contains the
definition of class X, compilation yields object file X.class, consisting of
object file tables and Java interpreter byte codes.  It is the .class files
that are downloaded as applets from the WWW server to the WWW client browser.
If the file is named X.java, and X.class is needed by some other compilation,
javac will quietly compile X.java too.  In essence, javac has a built-in make
capability.  This is not gratuitous; Java files may depend on each other,
making it impossible to choose a correct order of compilation -- so javac
compiles the mutually dependent files together.
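
As a rough illustration of the ordering problem, here is a minimal sketch of
two classes that call each other.  The class names are made up for this
example.  In a real project each class would sit in its own .java file; they
are nested in one file here only so the sketch is self-contained.

```java
// Sketch: Even needs Odd and Odd needs Even, so neither can be compiled
// strictly "first". Names are hypothetical, chosen for this example only.
final class MutualDependency {

    static final class Even {
        // Intended for non-negative n only.
        static boolean test(int n) {
            return n == 0 || Odd.test(n - 1);
        }
    }

    static final class Odd {
        // Intended for non-negative n only.
        static boolean test(int n) {
            return n != 0 && Even.test(n - 1);
        }
    }

    public static void main(String[] args) {
        System.out.println("10 is even: " + Even.test(10));
        System.out.println("7 is odd:   " + Odd.test(7));
    }
}
```

A compiler that insisted on one class at a time, in dependency order, would
have no valid starting point for this pair.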

The compiler javac distributed by SUN makes the byte codes.  Anyone else could
implement such a beast.  It is much easier than a C++ or C compiler.  SUN would
be happy if nobody else implemented a javac so as to avoid the possibility of
introducing nonstandard or nonportable constructs into the language.  There are
big players out there that might just go ahead and extend Java against SUN's
wishes.  You can call it Java if (1) you sign a license with SUN and (2) pass
the conformance suite.

As one might expect there are bugs in javac.  Since it has a compilation model
that lets it quietly compile anything it needs, the user sometimes thinks there
is a bug when javac merely accessed a stale file somewhere on CLASSPATH.  But I
have caused javac to throw an exception and collapse in a rubble of bits too. 
You can see the recursion in the unwind message from the exception.

It would be relatively easy to reverse engineer downloaded byte codes to
rediscover a good approximation to the original Java source for any class
(using a decompiler).  That is, as an ANDF (Architecture Neutral Distribution
Format), Java byte codes fail to protect proprietary software sent to the
client side for execution.  Most businesses are therefore giving away the
client-side software.  SUN is distributing its software with the names of
methods and fields scrambled.  Sort of a cheap C Shroud (tm, Gimpel Software).
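
One reason decompilation works so well is that a class file carries the
names of its fields and methods.  The sketch below is not a decompiler; it
only uses the reflection API (java.lang.reflect, which is not part of the
JDK 1.0 libraries described in this handout) to list the names that survive
compilation.  The Pricing class and its members are hypothetical.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class NamesSurvive {

    // Hypothetical "proprietary" class shipped to the client.
    static final class Pricing {
        private double secretMarkup = 1.35;

        double quote(double cost) {
            return cost * secretMarkup;
        }
    }

    // Collects the declared field and method names of a class, sorted.
    // Compiler-generated (synthetic) members are skipped.
    static List<String> memberNames(Class<?> c) {
        List<String> names = new ArrayList<>();
        for (Field f : c.getDeclaredFields()) {
            if (!f.isSynthetic()) {
                names.add(f.getName());
            }
        }
        for (Method m : c.getDeclaredMethods()) {
            if (!m.isSynthetic()) {
                names.add(m.getName());
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        System.out.println(memberNames(Pricing.class));
    }
}
```

Scrambling replaces names like these with meaningless ones.  That hides
intent but leaves the structure of the code just as recoverable.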

Some of the inherent protections in the Java language are no pointer arithmetic
and bounding of all memory access.  There is therefore no aliasing of storage. 
There also may be public key signatures to verify that the downloaded applet is
from a trusted supplier.  Almost all the cute tricks we hackers are known to play
to break into your system are prevented by the language specification and
implementation.  This is absolutely essential because applets from outer space
execute with your privileges at the click of a button.
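
A small sketch of what "bounding of all memory access" means in practice:
reading one element past the end of an array is refused by the runtime
rather than returning whatever happens to be stored next door.  The class
and method names are made up for this example.

```java
final class BoundsChecked {

    // Returns true if the runtime refused the read of buffer[index].
    static boolean readIsRejected(int[] buffer, int index) {
        try {
            int ignored = buffer[index];
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        int[] buffer = new int[4];
        System.out.println("read at 3 rejected:  " + readIsRejected(buffer, 3));
        System.out.println("read at 4 rejected:  " + readIsRejected(buffer, 4));
        System.out.println("read at -1 rejected: " + readIsRejected(buffer, -1));
    }
}
```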

Furthermore, each downloaded applet is checked for integrity by the Java
interpreter (or byte code compiler).  The outcome of the check is an assurance
that even the "machine language" of the interpreter cannot do anything bad to
you.  There is no proof, however, that the check is complete.

There is a reason (besides stealing code) to want the decompiler mentioned
above.  One way to reassure yourself that the downloaded Java byte codes are
not doing bad things is to inspect the Java source before it runs.  But the
only sure way to get the source is client-side decompilation.

Java is inherently portable.  The arithmetic types are completely defined.  For
example, type int is 32 bits, long is 64.  Operands are evaluated
left-to-right.  The "as if" rule applies.  Floating-point arithmetic is IEEE 754.  Character input
is UTF-8 Unicode; if the Kanji "rice paddy" symbol is desired in an identifier,
it can be used.  And so on.  There are few "implementation defined" or
"undefined" behaviors.  One has to do with badly synchronized threads for 
which nobody currently has a technical solution.
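
A few of these guarantees can be checked directly.  The sketch below is only
an illustration; the class and method names are invented, and the constants
Integer.SIZE and Long.SIZE come from later JDKs than the 1.0 release
discussed here.  The identifier in unicodeIdentifier() is written as a
Unicode escape for the character U+7530 so that the file stays plain ASCII.

```java
final class PortableArithmetic {

    // Operands are evaluated left to right: the left-hand i is read (as 1)
    // before the assignment inside the parentheses takes effect.
    static int leftToRight() {
        int i = 1;
        return i + (i = 5) * 2;   // 1 + 5 * 2
    }

    // int is 32 bits everywhere, so overflow wraps the same way everywhere.
    static int wrapAround() {
        return Integer.MAX_VALUE + 1;
    }

    // An identifier consisting of the single character U+7530.
    static int unicodeIdentifier() {
        int \u7530 = 3;
        return \u7530 + 1;
    }

    public static void main(String[] args) {
        System.out.println("int bits:      " + Integer.SIZE);
        System.out.println("long bits:     " + Long.SIZE);
        System.out.println("left to right: " + leftToRight());
        System.out.println("wrapped:       " + (wrapAround() == Integer.MIN_VALUE));
        System.out.println("unicode ident: " + unicodeIdentifier());
    }
}
```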

There is a question of what to do if Java is not quite right for you.  There is
an enormous stake at SUN in preserving Java as a standard.  But there is in fact
no reason not to extend Java yourself so long as you can transform your
extension into Java source or valid Java byte codes.  The people who execute
your programs will never know you "cheated".  IEEE and ISO have been looking
into a Java standards effort.  You too can get in on the fun by subscribing to
SC22JSG-request at dkuug.dk.


                           execution
           ------------------------------------------------

The Java byte codes are interpreted by the Java virtual machine (JVM). It is a
little stack computer (like you all wrote in school) which grabs a byte at a
time and then switches on it for interpretation. The JVM also handles calls to
native methods (compiled C), garbage collection, threads and exceptions.  There
are six standard class libraries (in JDK 1.0; there may be as many as 10 in
JDK 1.n).  Fast interpretation is an important sales point for
workstations.  As a consequence the workstation vendors are taking
responsibility for delivering highly tuned versions of the JVM along with their
operating systems.  This requires a license from SUN which most vendors have
already signed.
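
For readers who did not write one in school, here is a minimal sketch of the
"grab a byte and switch on it" loop.  The four opcodes are invented for this
example and are not real JVM opcodes.  A real JVM also deals with methods,
objects, exceptions and much else.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class TinyStackMachine {

    // Hypothetical opcodes, chosen for this sketch only.
    static final byte PUSH = 0;   // followed by one operand byte
    static final byte ADD  = 1;
    static final byte MUL  = 2;
    static final byte HALT = 3;   // result is the top of the stack

    static int run(byte[] code) {
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0;
        while (true) {
            byte op = code[pc++];             // grab a byte...
            switch (op) {                     // ...and switch on it
                case PUSH:
                    stack.push((int) code[pc++]);
                    break;
                case ADD:
                    stack.push(stack.pop() + stack.pop());
                    break;
                case MUL:
                    stack.push(stack.pop() * stack.pop());
                    break;
                case HALT:
                    return stack.pop();
                default:
                    throw new IllegalStateException(
                        "bad opcode " + op + " at " + (pc - 1));
            }
        }
    }

    public static void main(String[] args) {
        // (2 + 3) * 4
        byte[] program = { PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT };
        System.out.println(run(program));
    }
}
```

Every instruction pays for the fetch and the switch, which is a large part
of why interpretation is slow compared with compiled machine code.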


                   on-the-fly compilation   (OTF)
                   just-in-time compilation (JIT)
                   ------------------------------

The JVM is too slow for many applications.  It is fast enough for some
interactive applets.  But so are the scripting languages (see below).  It is
technically feasible to compile Java byte codes into machine code.  Since
portability comes from using byte codes as an ANDF, applet compilation is best
done on the client side.  This has spawned a nearly frantic rush to implement
fast compilers alongside the JVM. There are two related forms of compilation. 
If compilation is delayed until the Java class is needed, it is called
on-the-fly (OTF).  If the need for compilation is anticipated and started soon
enough so that the executable is ready when needed, it is called just-in-time
(JIT). In fact all current compiler efforts are using the JIT description
whether or not it is true.  Because the user is waiting for the compilation to
finish, it is unlikely that much optimization will be done in JIT compilers.
If compiled Java is cached on an organization-wide basis, then the tradeoffs
for and against optimization change again (see below).
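
The difference between the two forms is purely one of timing, which the
following sketch tries to show.  Here "compilation" is just a Supplier that
produces some result.  The class names are made up, and CompletableFuture is
a much later library addition used only for convenience.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

final class CompileTiming {

    // On-the-fly: nothing happens until the class is needed, and the
    // caller waits for the compiler. Assumes the compiler never returns null.
    static final class OnTheFly<T> {
        private final Supplier<T> compiler;
        private T compiled;

        OnTheFly(Supplier<T> compiler) {
            this.compiler = compiler;
        }

        synchronized boolean isCompiled() {
            return compiled != null;
        }

        synchronized T get() {
            if (compiled == null) {
                compiled = compiler.get();
            }
            return compiled;
        }
    }

    // Just-in-time, in the sense used above: compilation is started ahead
    // of need on another thread, so get() usually finds the result ready.
    static final class Anticipated<T> {
        private final CompletableFuture<T> pending;

        Anticipated(Supplier<T> compiler) {
            this.pending = CompletableFuture.supplyAsync(compiler);
        }

        T get() {
            return pending.join();   // waits only if the compiler is not done
        }
    }

    public static void main(String[] args) {
        OnTheFly<String> lazy = new OnTheFly<>(() -> "machine code");
        Anticipated<String> eager = new Anticipated<>(() -> "machine code");
        System.out.println("compiled before first use: " + lazy.isCompiled());
        System.out.println(lazy.get() + " / " + eager.get());
    }
}
```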

It is not yet clear whether the vendors will choose to support a mixed
interpretive/compiled model or just compile everything.  The behavior of the
Java code must be the same regardless of the choice.  Since the interpreter
does some things in a way that is not usual for compiled code, the compilers
are forced to implement the interpreted behavior.  In particular, linking and
initialization are interwoven with execution.  With threads it seems very
likely that differences (bugs?) will show up when perfectly happy interpreted
programs are compiled, changing all the timing relationships.

There is a question of how to make the JIT available.  The point of attachment
could be the JVM (after all, it is the JVM that must switch between interpreted
and compiled code) or it could be the browser (after all, it is the browser
that is actually running on your desktop).  The industry is rapidly converging
on delivery of JIT with the operating system.


                      conventional compilation
                      ------------------------

Conventional compilers will not be hard to write.  A few are already beginning
to show up on the net.  guavac is a GNU-based compiler.  There is a project to
produce cleanroom versions of all the Java related tools.  All platform vendors
will need to ensure availability of a conventional Java compiler on all 
platforms, build or buy.

The Java language is LALR(1).  yacc likes the grammar. There are 103 terminals,
155 nonterminals and 327 grammar rules.  The UTF8 input defeats lex however. 
So yacc needs to be mated to a custom lexer.  Wanting to write all of this in
C++ forces one to deal with yacc output which is most decidedly C.   Sun has
a yacc-like program written in Java for Java.

The issue of a backend is clouded by the fact that Java cannot use the most
sophisticated C and Fortran optimizations (for matrices) and needs some kinds
of optimization that no current optimizer supplies.  The opportunities for
optimization include eliminating redundant checks for null dereference and
out-of-bounds subscripts, avoiding use of the heap when a stack discipline would suffice,
providing efficient method and field access, and post-link cross class
inlining.  This list will give compiler groups something to do for the next few
years.

Because of the response time requirement for JIT compilation, an optimized
class might not be the best solution.  It is possible that first execution is
interpreted while a JIT is preparing a reasonably fast executable.  One can
imagine yet another layer where the non-optimized code is used until the
optimized class is available for loading.  With corporate-wide caching, only
the first user would have to wait for the compiler.  Other users could, for the
cost of an intranet transfer, load the cached version.
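
The layering described above can be sketched as a call site that uses the
interpreted version until a compiled one shows up.  This is only a model of
the idea: both "versions" are ordinary Java functions, the class name is
invented, and a real JVM would switch at a much lower level.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.IntUnaryOperator;

final class TieredCall {

    private final IntUnaryOperator interpreted;
    private final CompletableFuture<IntUnaryOperator> compiled;

    // 'compiled' completes whenever the JIT (or a shared cache) delivers
    // the fast version; until then calls fall back to 'interpreted'.
    TieredCall(IntUnaryOperator interpreted,
               CompletableFuture<IntUnaryOperator> compiled) {
        this.interpreted = interpreted;
        this.compiled = compiled;
    }

    boolean usingCompiled() {
        return compiled.isDone() && !compiled.isCompletedExceptionally();
    }

    int call(int arg) {
        // getNow returns the fallback (null here) if compilation is not done.
        IntUnaryOperator fast = usingCompiled() ? compiled.getNow(null) : null;
        return (fast != null ? fast : interpreted).applyAsInt(arg);
    }

    public static void main(String[] args) {
        CompletableFuture<IntUnaryOperator> jit = new CompletableFuture<>();
        TieredCall doubler = new TieredCall(x -> x * 2, jit);
        System.out.println(doubler.call(21) + " compiled=" + doubler.usingCompiled());
        jit.complete(x -> x << 1);   // the "compiled" version arrives
        System.out.println(doubler.call(21) + " compiled=" + doubler.usingCompiled());
    }
}
```

The caller sees the same answer either way, which is the requirement: the
behavior must not depend on which version happens to run.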


                      Java Class Libraries
            ----------------------------------------

There are six standard Java Class Libraries (in the JDK 1.0).
    java.applet
    java.awt
    java.io
    java.lang
    java.net
    java.util

The Java source for the libraries is about an inch of double-sided printout. 
The content of the libraries is about what you are used to in C or C++.  The
awt package (Abstract Window Toolkit) is a lowest-common-denominator interface
for Win, X and Macintosh.  The lang package contains support for threads and
exceptions as well as math routines and the like.  The util package contains
date and time, hashtables, and so on.  The io package contains more than a
dozen classes devoted to catching exceptions of one kind or another.  There is
considerable grumbling about the awt library.  It is likely to be redesigned
and rewritten by SUN and others.  Microsoft has completely rewritten awt for
WIN95 and WINNT.  SUN will deliver the Microsoft-written code with its tools
for the MSDOS platforms. 

What does not appear in the inch-thick class library printout is the native
methods.  That is, where the authors have left Java and gone into C to
implement low-level functions.  This includes things hard or impossible to do
in Java and also things not worth the trouble to recode, such as sin and cos,
and also things that have to run fast.  There are more than 100 such native
methods, which have to be ported separately to each platform.  A large number
of them are collected into System.java, where Java connects up to the
underlying run time system of the host platform.  Each vendor is likely to take
responsibility for tuning the native methods on their own platforms.  This is
a nontrivial and detailed issue in licensing.
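
On a current JDK you can see which methods of a class are native by asking
the reflection API (again, a facility newer than JDK 1.0).  The sketch below
is just a way to look around.  Which methods are native differs from one JDK
release to another, so treat the printed lists as a property of your own
installation.  The class name is made up.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class NativeMethodList {

    // Names of the methods declared by c that are implemented natively.
    static List<String> nativeMethods(Class<?> c) {
        List<String> names = new ArrayList<>();
        for (Method m : c.getDeclaredMethods()) {
            if (Modifier.isNative(m.getModifiers())) {
                names.add(m.getName());
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        System.out.println("Object: " + nativeMethods(Object.class));
        System.out.println("System: " + nativeMethods(System.class));
        System.out.println("Thread: " + nativeMethods(Thread.class));
    }
}
```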

Presumably the number of available Java class libraries will grow over time. 
In some sense this will be the measure of Java's success because it will
measure the industry investment in Java.  It is almost never the case these
days that an application is written from scratch -- rather an application is a
cobble-together of existing library functions to do something unique. 
Graphics.  Windows.  Math.  IO.  This is practical reuse.  This also implies
that the skill of a developer in Java is directly related to a comprehensive
knowledge of existing class libraries.  Dozens of specialized class libraries
have been announced by various 3rd parties.  One of the most interesting is
JGL, which is the functionality of the C++ Standard Template Library done in
Java.  SUN will ship JGL with its next version of Java.

Of course, using a class library that is not resident on the client requires
that each class be downloaded and compiled.  This can be a serious impediment
to use, particularly with the load on the internet passing crisis levels.  It
is clear that some kind of caching is going to be necessary to offload
repetitive web requests from the internet and repetitive compiles.  There is
a startup, Marimba, that has some products in this area.

One of the best ways to learn Java is to read the class libraries. There are
plenty of examples of almost everything there, written by good Java
programmers.


                       browsers & HTML
            --------------------------------------------

There are several HTML browsers available.  NetScape has about 50% of the
market according to some sources.  But there is Mosaic,  HotJava from SUN,
Internet Explorer from Microsoft (now also 50%) and some other minor players.

HTML itself is undergoing rapid change.  NetScape 3.0 is supporting frames,
which allow breaking the browser page into independent screenlets, each with
scroll bars.  And tables, and who knows what else.  It changes with each
release.  The WWW consortium is the best (only) control on markup languages. 
VRML has been in the works for some time.  It supports 3D graphics, probably as
a layer on some standard graphics package.  Whether this kind of stuff shows up
in a markup language or as a Java class is still being contested.  The first
implementation of VRML has arrived: it is entirely written in Java.

SUN has hinted that with Java you do not need HTML.  Just compute the screen
layout.  At the moment that does not make sense because it is the browsers that
have the network capability.  But it is true that one can duplicate any gui
behavior from Java -- a browser within a browser.  Everything depends on who is
the master.  I cannot imagine SUN coming out with an HTML-less browser since it
would not handle legacy pages.  But I can imagine SUN writing the HTML
interpreter in Java so that it can be downloaded if necessary, then writing a
browser that runs without HTML until it is needed.  Over time HTML might fade
away under such a scenario.  What SUN has actually done is break up their
HotJava browser into a set of class libraries that allow users to make custom
browsers.  HotJava is now just an example of the use of the class libraries. 
Presumably there will be more browsers, each specialized to some particular
use, such as internet shopping.

As mentioned above, presumably each browser will supply an API that allows the
script interpreter du jour to be attached.  This is in contrast to the original
situation where NetScape seemed to be linking everything into the browser. 
NetScape got much bigger with each release. (It is 7.3 meg on my alpha Unix
box, 2.5 meg on my intel NT box).  In fact, NetScape is heading for a general
API which could attach anything the client had available (see netwrap below). 
SUN has claimed that browsers ought not to even have the network protocols
built in, but rather loaded as Java applets. 

                      Developer's Tools
         --------------------------------------------

Programming in Java using the JDK (Java Development Kit) is about like using C
or C++: lots of cut and try.  Lots of command-line typing.  There is a
primitive debugger.  There is a fake browser (called the applet viewer) which
can execute applets for you.  Version control, drag and drop composition, and a
class browser are on the way from SUN.  OSF has been asked by SUN to make a
multiplatform version of the JDK/Java Workshop.  See authoring tools below for
more details.

The JDK is a command-line oriented, PATH variable sensitive, development
environment for Java.  It is conceptually a tool of the 70's.  SUN's plan is to
improve debugging, add drag and drop, version control, project management, web
page management and class browsing to its JDK, and call it the Java Workshop. 
There is an alpha release of Java Workshop, based on the HotJava browser (for
SUN and WIN95).  It is very pretty, pretty slow, and plenty buggy.  Later
versions are reported to be better, but writing in Java has some inherent
inefficiencies.

By way of contrast, Visual Basic has a drag-and-drop authoring environment that
is years ahead of the JDK and perhaps even the Java Workshop.  What has kept
Java alive is its inherent security and nerdness, which Visual Basic lacks. 
Bill Gates is trying to do something about this so that Java is not the only
language in which WWW content can be provided.

On the other hand nearly everyone is shouting about their Java development
tools.  Borland has Latte; Rogue Wave has something called J-factory; a
Nashua-based spinoff of Digital has ported an interesting development
environment to the alpha boxes; Asymetrix, also staffed by ex-Digital compiler
people, has an integrated C/C++/Java fast turnaround environment; Symantec has
Cafe; and so on.  The candidates show up at the rate of several a week. 
NetScape Gold itself has built-in authoring tools for HTML.  You can also use
NetScape with secret flags to access the JVM, which, when pointed at the javac
byte codes, provides a Java compiler.  Symantec has apparently captured market
share with its development environment.  It is pretty slick.

It seems obvious that WWW servers, competing for your business, will want to
make page authoring easy.  As a consequence, the server is likely to provide
downloadable authoring tools with an [INSTALL] button to cause your freshly
crafted page to be uploaded and installed on the server.  It makes writing WWW
pages a one-stop shopping expedition.  The alternatives are logging on to the
server and using the JDK (which OSF ported to alpha Unix and other platforms),
or doing the authoring somewhere else (your desktop) and then having a separate
installation protocol with the server.  Again, because downloaded authoring
tools are possible, someone will do them.


                      testing and conformance
           ------------------------------------------------
           
Testing any nontrivial Java program is difficult.  It usually has a gui and
usually has threads.  Building an automated test suite is even harder.  Add to
this the vastly different timing of compiled and interpreted code, and  you
have a prescription for a research task.  The most likely outcome will be a
large number of tests that run one way on somebody's interpreter and a
different way on somebody else's JIT.  Tossing the test out is not a good
solution, but getting good comprehensive tests is going to be a massive effort.

On the other hand, every licensee is going to have to pass the conformance
suite within 90 days of a change announced by SUN.

SUN has also announced a suite of testing tools (for about $3K/seat).  

                             netwrap
          --------------------------------------------------
 
Could Microsoft Word be downloaded as java classes?  The answer is yes, if
there is a client-side compiler.  No wonder Microsoft is worried.  It is vastly
easier to shop on the net, and the product that comes down the wire is vastly
cheaper to deliver.  That is, no distribution channel, no box and floppies, no
stale software to buy back from the distributor.  Furthermore, it is well known
that most users use a very small part of their software.  Fine.  The unused
applets will just never be downloaded at all.  With a little cleverness in
design, the local copies of the applets can be kept up to date, replacing stale
applets with shiny new ones automatically, obviating all the version upgrade
nonsense.  This set of capabilities is beginning to show up in announcements
but I have not used any yet.  There is a Claris demo of an Office suite that
you can try out.

The problems of charging for software are being worked out by the retail
industry.  You can shop LL Bean on the net.  There are startups with patented
gadgets to make such shopping worry free (except you might overspend your
credit card).

One can imagine a world where the end user never leaves the WWW browser.  All
the needed software, from spread sheets to data base access can come over the
net.  This brings up the question of exactly what piece of hardware can support
this usage.  The TV settop device industry is the first attempt to leverage the
low net software delivery costs to make a low-cost client, using the TV set for
display and your favorite TV channel clicker as a UI.  Well, TV screen
definition is going to have to get better before I will replace my PC with TV. 
But HDTV is on the way, so they say.  Alternatively, a high quality computer
display can be used.  And bandwidth down the cable is not a problem, which
will solve one of the ugliest truths of the WWW today -- that it is out of
bandwidth and usage is increasing.

Several companies, including SUN, have announced low cost network computers.
SUN uses their new java chip set which directly executes the Java byte codes.
Others will use cheap RISC chips such as ARM and STRONGARM.  Conceptually, the
"network is the computer".  Just as the PC burrowed in beneath the mini, the
network computer can burrow in beneath the PC.  SUN is rather obviously
gloating at the challenge Java has presented to otherwise dominant Microsoft.


                           killer apps
       -----------------------------------------------------

It makes sense to speculate on what can be written in Java to make money.
Downloaded authoring tools are one incestuous example.  Local system management
is another.  That is, the downloaded applet could do backups, run diagnostics,
upgrade software, and so on for some local group.

First idea gets rich.






