“little brother” Analysis Tools

11/26/2017

The Pharo Smalltalk Environment
To understand how the query, analysis and visualization tools work, a short overview of the Pharo Smalltalk environment in which they run is necessary. The following describes the code analysis tools written earlier. The infrastructure analysis tools will be built on them, but with different models based on infrastructure engineering standards.
Smalltalk is a live object environment. If you’re familiar with other live environments, such as the Emacs LISP console or the Python console, the Smalltalk environment is similar in that arbitrary code can be typed in (in Smalltalk, the area in which you type arbitrary code is not a console, but a text area known as a ‘playground’) and executed immediately without a compile/build cycle. For example, typing ‘MessageFlowBrowser new open’ in the playground and executing it instantiates a MessageFlowBrowser object, which then appears in the workspace and can be used.
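Unlike a Python console, where the input is parsed against the language's keywords and statement syntax, Smalltalk has almost no reserved words (only a handful of pseudo-variables such as self, super, nil, true and false); everything else is a message sent to an object. ‘MessageFlowBrowser’ is a reference to the class object for the MessageFlowBrowser class, and all class objects are already present in the running image. ‘new’ is not a command but a message sent to that class object: a selector, plus any arguments, that the receiver looks up at run time to find the method that answers it. In Pharo, the default implementation of ‘new’ creates the instance and then sends it ‘initialize’, so the class's own ‘initialize’ method is what sets the new object up.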
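A minimal sketch of the kinds of message send, which can be tried in a playground. It uses only core classes (Object, Point, Array) rather than the MessageFlowBrowser class from the text, and the variable names are illustrative.

```smalltalk
| fresh same point evens |
"Unary message: a selector with no argument."
fresh := Object new.

"In Pharo, the default new is equivalent to basicNew followed by initialize."
same := Object basicNew initialize.

"Binary message: an operator selector with exactly one argument."
point := 3 @ 4.

"Keyword message: colon-terminated parts, each followed by an argument."
evens := #(1 2 3 4) select: [ :each | each even ].

"Precedence is unary, then binary, then keyword, each left to right."
3 + 4 factorial.    "27, because 4 factorial is evaluated first"
```

Classes that need arguments at creation time usually offer their own keyword messages on the class side instead of a bare ‘new’.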
This indirection, where the receiving object determines how to respond to a message (with or without arguments), provides a simpler call for the class's clients. It also ensures the author of the class determines how the class can be called, rather than the author of the calling class having to figure out from parameter names what specific variables to pass in, which is often difficult to determine without access to the source.
Class and instance objects understand various messages by convention, design or via inheritance. By comparison with other object languages, the base Smalltalk Object and ProtoObject classes understand a rich set of messages that all Smalltalk objects inherit, including the full set of reflection messages. Classes can have both class variables and methods, and instance variables and methods. All instances of a class share the same value for a class variable, whereas each instance holds its own values for its instance variables.
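A few of the inherited reflection messages can be tried on any object in a playground. This is a small sketch using the integer 42; the results in the comments are what a current Pharo image answers.

```smalltalk
42 class.                     "SmallInteger"
42 class superclass.          "Integer"
42 respondsTo: #factorial.    "true"
42 isKindOf: Number.          "true"
Object respondsTo: #new.      "true, since classes are objects and receive messages too"
```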
Since ‘MessageFlowBrowser’ is a reference to an actual object, it can be highlighted, and its state inspected, its code can be browsed and edited, or refactoring tools can be applied to it with the effect occurring as soon as the change is saved.
This is the case not just with declared classes and messages, but also with literals and operators. ‘42’, if inspected, is an integer object with the value 42, and will respond to any message that integer objects understand, including the standard arithmetic operators such as +, -, * and /, as well as more complex messages such as factorial. Variables in Smalltalk are dynamically typed: a variable has no declared type and can refer to any object. This is not the weak typing of JavaScript, however. Every object belongs to exactly one class, which determines the messages it understands, and implicit conversions are essentially limited to those between numeric classes during arithmetic.
Thus typing ‘42 factorial’ in a playground and executing it gives you a (very large) exact numerical result. The equivalent computation in Java using int or long overflows silently and gives a wrong answer, since Java will not implicitly promote a primitive integer to a class large enough to hold the result; BigInteger has to be used explicitly. If the Smalltalk result is inspected, it is a LargePositiveInteger object.
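The automatic promotion between numeric classes can be checked directly. This is a short sketch for a playground; the class names in the comments are those used by current Pharo images.

```smalltalk
42 factorial class.                 "LargePositiveInteger"
(SmallInteger maxVal + 1) class.    "LargePositiveInteger"
42 factorial / 41 factorial.        "42"
1 / 3 + (2 / 3).                    "1, since Fraction arithmetic stays exact"
```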
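Polymorphism in Smalltalk is not tied to declared types or interfaces as it is in Java: any object that understands a message can stand in for any other object that understands it. There are no casts. Sending an object a message it does not understand raises a doesNotUnderstand: error at run time. The ‘become:’ message is a separate mechanism: it swaps the identities of two objects, so that every reference to one now points to the other.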
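As a small illustration of how the same message is answered by objects of different classes, the following sketch can be run in a playground. The selector #foo is a made-up name that no class is assumed to implement.

```smalltalk
"The same message works on an Integer, a Float and a Fraction."
{ 3. 2.5. 3 / 4 } collect: [ :each | each squared ].    "9, 6.25 and 9/16"

"A message the receiver does not understand raises an error at run time."
[ 42 perform: #foo ]
    on: MessageNotUnderstood
    do: [ :error | error return: #notUnderstood ].       "#notUnderstood"
```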
It needs to be kept in mind that the entire environment is written in Smalltalk. There is no difference between VM code, interpreter code, JIT compiler code, or user code. Apart from code that calls out to external libraries such as OS libraries, the interpreter or JIT compiler converts the Smalltalk code directly to assembler, and since it is written in Smalltalk, the resulting assembler implicitly follows the rules and restrictions of the language.
The usual demarcation of languages into first generation (assembler), second generation (C, Pascal etc., written in assembler), and third or fourth generation (such as Java, written in C and interpreted/compiled by C code, or Groovy, written in Java and precompiled in Java, then interpreted/compiled by C code) is not relevant to Smalltalk. The language, tools, VM, interpreter and JIT compiler are all themselves written in Smalltalk.
Since the bootstrapper is no longer operative in Pharo once the base VM and interpreter classes are loaded, the environment contains all the code necessary and contains it all in the same manner. Thus, new code imported from a repository or written by the developer is as much part of the VM as the VM kernel or interpreter, and any of it can be changed dynamically at runtime.

The Model Environment
Within the tools, a model is a representation of the target code base in FAMIX format. FAMIX models can be generated from various code bases; currently the tools support Smalltalk (unsurprisingly), Java, JavaScript, LISP, Python, SQL and ABAP. Prior to release I plan to support a few other JVM based languages (since converting the respective Eclipse IDE model to FAMIX is relatively trivial), including Scala, Clojure and Ceylon.
The FAMIX model, which is stored as a text file with the extension ‘mse’, is imported into the Smalltalk environment, creating model objects based on the FAMIX meta-model. The resulting model is displayed via a UI toolkit designed for developer tools, which provides a ‘sliding pane’ interface with contextual tabs determined by the pane’s content. Thus, a pane containing a Class Group will have different tab options than one containing a Class or a visualization. All the panes contain a query tab; code entered in that tab and executed produces a result that is displayed in the next pane to the right (which is created if it doesn’t yet exist). As an example, in a class group that consists of all the classes in a project, entering the following query will open a new pane and slide the current pane to the left. The new pane will contain all the classes that satisfy the query:
self select: [ :each | each isAnnotatedWith: 'Deprecated' ]
In Smalltalk, ‘self’ is a reserved pseudo-variable that refers to the receiver; in a query tab it refers to the content of the current pane, which here is the class group. Since the class group is a Smalltalk collection, it understands the keyword message select: with the block object as its argument. Each member of the Class Group is a FamixClass object, and FamixClass objects understand the keyword message isAnnotatedWith: aString. In each pane, the last tab shows the meta-model attributes that that model object understands, so the available operations are close at hand when writing a query.
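The same block-based messages work on any Smalltalk collection, not just on FAMIX groups. This is a minimal sketch on a plain array of numbers, which stands in for a class group; the variable name is illustrative.

```smalltalk
| numbers |
numbers := #(1 2 3 4 5 6).
numbers select: [ :each | each even ].       "#(2 4 6)"
numbers reject: [ :each | each even ].       "#(1 3 5)"
numbers collect: [ :each | each * each ].    "#(1 4 9 16 25 36)"
numbers detect: [ :each | each > 4 ].        "5"
```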
In the situation where we wanted to know what deprecated classes could be simply removed from a code base without causing a problem, we could write a query in the result pane such as this:
self select: [ :each | each clientTypes isEmpty ]
This will create another pane with the deprecated classes that are not called by any other classes in the codebase, and can therefore be removed. For the sake of argument, let’s say there were 25 deprecated classes, 14 of which aren’t called by any other class. That leaves 11 deprecated classes that are still referenced. However, it would be nice to know in what way they are referenced: are the calling classes themselves deprecated? Is a deprecated class called by more than one client class? Code queries aren’t ideal for this kind of analysis, though, so instead we can write a visualization such as this:
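| view |
"Create a Roassal Mondrian builder for the graph."
view := RTMondrian new.
"Draw each node as a circle; color deprecated classes red."
view shape circle
    if: [ :each | each isAnnotatedWith: 'Deprecated' ]
    color: Color red.
"Nodes: the classes in this pane plus all of their client classes."
view nodes: (self, (self flatCollect: #clientTypes)) asSet.
"Edges: one from each client class to the class it uses."
view edges connectFromAll: #clientTypes.
"Force-based layout, with edges drawn behind the nodes."
view layout force.
view view pushBackEdges.
"The final expression is the value shown in the new pane."
view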

This script uses the Roassal toolkit, and specifically the RTMondrian class, to produce the visualization, the results of which are shown below:

Although the details of how to write a script to produce all the possible visualizations are outside the scope of this overview, there are a couple of things worth noting. The expression with the hash, #clientTypes, is a symbol: a literal that names a message rather than sending it. By convention, a symbol can be passed wherever a one-argument block is expected, in which case the named message is sent to each element in turn. Since each object looks up the message in its own class, #clientTypes may be answered by a different method in a FAMIX class object than in a FAMIX method object, and each responds with the appropriate result.
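The way a symbol such as #clientTypes stands in for a message can be seen with a plain collection. This is a small sketch using #squared on numbers rather than a FAMIX model.

```smalltalk
#(1 2 3) collect: #squared.                    "#(1 4 9)"
#(1 2 3) collect: [ :each | each squared ].    "#(1 4 9), the same query written as a block"
#squared value: 5.                             "25"
5 perform: #squared.                           "25"
```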
Other tabs that are available, depending on the content of the pane, include a summary (.) on a group of model objects, a UML class diagram in the case of a class or type group, etc. Aside from the pane containing the meta-model information for that pane’s content, there is a separate meta-model browser allowing the engineer to browse the full FAMIX meta-model to find what he or she might be looking for.
