Thursday, November 27, 2014

Using Mac OS X Server for your own private email

The NSA spying scandal and the broad rules for seizing "database records" (a.k.a. your email) from ISPs made me wonder what it would take to run my own email system. My requirements:


  1. It should be private - and run on servers I own.
  2. It should run on a Mac.
  3. It should require little administration once set up.
  4. It should be able to send mail to external domains.
  5. It should work with Frontier home internet service (dynamic IPs, blocked SMTP ports).

So I purchased OS X Server 4.0 for OS X 10.10.1 (Yosemite). I was impressed with how much functionality is packed into this $19.99 package. Anyway - on to setting up the mail server.

Setting up email to work locally in your network.

  1. Start up OS X server
  2. Enable the Mail service.
  3. Create a domain in the mail service (e.g. mydomain.com).
  4. Create email addresses in this mail domain. (Each email address appears to map to a local user. I'm not sure how to set up mailboxes without creating users.)
  5. At this point you should be able to set up email clients that use IMAP or POP to send and receive mail to each other within the domain. However, you cannot yet send or receive external email (a quick way to check the mail ports from Terminal is sketched after this list).
  6. Use the log (Mail / SMTP Log) to get familiar with how you can observe the behavior of the mail services. This will come in handy later.
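A quick way to confirm from another machine on the LAN that the mail ports are answering is shown below. This is a minimal check and the host name is an assumption - substitute your server's name.

telnet server.local 143                                       # plain IMAP; expect an "* OK ..." greeting
openssl s_client -connect server.local:993                    # IMAP over SSL/TLS
openssl s_client -connect server.local:587 -starttls smtp     # SMTP submission, if you enabled it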

Internet - receiving mail from the net


  1. I wanted an SMTP backup server, so email can arrive even when my server is down. I used the dyn.com mail backup service. The other advantage is that their servers are more secure, can filter out a lot of the spam before it reaches your server, and can restrict the accepted user names. The downside is that, to some extent, your mail is stored on their servers. If I had a reliable and secure 24x7 server in an EU country I might not need them, but that needs some more searching.
  2. Go to Dyn.com - and create a dynamic host.
  3. Download the DynDNS utility to automatically update the dynamic host with your public Internet IP.
  4. Open the firewall ports - use your router's port forwarding to send port 25 traffic to your OS X server.
  5. In Dyn mail backup setup, enter the dynamic host - this is what Dyn will contact if it gets an email for you. 
  6. Go to your domain's DNS and set up the MX records to point to the dyn.com backup SMTP servers. If you want, add your own server as an MX record too (a quick way to verify the records from Terminal is sketched after this list).
  7. Test it out. Send yourself an email from yahoo/outlook/gmail...
  8. Check the Mail logs in the Server App - to see whether Dyn contacted your server. 
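Before moving on, you can verify the DNS pieces from Terminal. A minimal sketch - the domain and dynamic host names below are placeholders for your own:

dig +short MX example.com          # should list the dyn.com backup relays (and your own host, if you added it)
dig +short myhost.dyndns.org       # should resolve to your current public IP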

Internet - sending email out.

  1. Edit your DNS to add a TXT record for SPF (I used "v=spf1 mx ~all", which allows the domain's mail exchangers to send mail for this domain).
  2. If you are behind an ISP that restricts SMTP for home users (e.g. Frontier), you must use their SMTP relay host (smtp.frontier.com). (Set up your Frontier email address; check that it works with your Mail app; then use that email address and password in the Mail service's relay host setup.)
  3. However, in the case of Frontier, the authentication was plain text. There was no way to configure this from the GUI, so drop down to Terminal and edit main.cf directly (a sketch of the relevant settings follows this list).
  4. Go back to the Server app and stop/start the mail service. Check your mail logs to see if you get an error that postfix did not start (in that case go back and check that you typed things correctly in main.cf above).
  5. Use your Mail app to see that you can send out mail.
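For reference, the relay settings in main.cf ended up looking roughly like the sketch below. Treat it as a sketch only: the Server app already writes some of these when you enter the relay host in the GUI, the config path is the OS X Server (Yosemite) location, and the port and credentials are placeholders. The one change I really had to make by hand was the smtp_sasl_security_options line.

CONF=/Library/Server/Mail/Config/postfix                       # where OS X Server keeps the postfix config
sudo postconf -c "$CONF" -e "relayhost = [smtp.frontier.com]:25"
sudo postconf -c "$CONF" -e "smtp_sasl_auth_enable = yes"
sudo postconf -c "$CONF" -e "smtp_sasl_password_maps = hash:$CONF/sasl_passwd"
# postfix defaults to "noplaintext, noanonymous"; dropping noplaintext allows Frontier's plain-text login
sudo postconf -c "$CONF" -e "smtp_sasl_security_options = noanonymous"
echo "[smtp.frontier.com]:25 myaccount@frontier.com:mypassword" | sudo tee "$CONF/sasl_passwd"
sudo postmap "$CONF/sasl_passwd"                               # build the lookup table postfix actually reads
dig +short TXT example.com                                     # the SPF record from step 1 should show up here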

DKIM signing on Mac OSX

  1. DKIM - DomainKeys Identified Mail - lets receiving servers verify that a message really was sent and signed by your domain.
  2. This Q&A on the Apple discussion forums talks you through setting up DKIM.
  3. Some changes are necessary for those instructions to work on Yosemite (OS X 10.10.1).
    1. The configuration files are located in /Library/Server/Mail/Config/amavisd instead of /etc/amavisd.conf
    2. There is no /var/amavisd - so I used the above directory to store the dkim_key file.
    3. To run the amavisd command, use "amavisd -c /Library/Server/Mail/Config/amavisd/amavisd.conf showkeys". Its output is the DNS TXT record to publish for your domain (publishing and verifying the key is sketched after this list).
  4. To flush the cache on Yosemite - use
    sudo discoveryutil mdnsflushcache;sudo discoveryutil udnsflushcaches
  5. Alternatively shutdown your DNS server and restart it from Server App.
  6. Send a new email from your Mail App.
  7. (Tip - in the Mac Mail app for Yosemite, use View > Message > All Headers to see whether a DKIM-Signature header was generated.)
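The showkeys output is what you paste into DNS; once the record is live, the same tool can verify it. A sketch - the selector and domain below are examples:

sudo amavisd -c /Library/Server/Mail/Config/amavisd/amavisd.conf showkeys   # prints the public key TXT record to publish
dig +short TXT dkim._domainkey.example.com                                  # check that the record is visible
sudo amavisd -c /Library/Server/Mail/Config/amavisd/amavisd.conf testkeys   # confirms the published record matches the signing key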



Saturday, January 12, 2013

Enterprise architecture RAMPS


Summary
How do you design a computing system to provide continuous service and to ensure that any failures interrupting service do not result in customer safety issues or loss of customers due to dissatisfaction?

Historically, system architects have taken two approaches to answer this question: building highly reliable, fail-safe systems with low probability of failure, or building mostly reliable systems with quick automated recovery.

The RAS (Reliability, Availability, Serviceability) concept for system design integrates concepts of design for reliability and for availability along with methods to quickly service systems that can't be recovered automatically. This approach is fundamental to systems where the concern is quality of service, customer retention, and due diligence for customer safety.

This article was compiled from Wikipedia and IBM.


What is Reliability, Availability, Performance and Serviceability?

Reliability is evaluated by mean time to failure, and availability is measured by service uptime over a year of continuous operation.  Reliability is a measure of the ability of a system to function correctly, including avoiding data corruption, whereas availability measures how often it is available for use, even though it may not be functioning correctly. For example, a server may run forever and so have ideal availability, but may be unreliable, with frequent data corruption.

Reliability of a system can be defined as the probability that it will produce correct outputs up to some given time t. Reliability is enhanced by features that help to avoid, detect and repair hardware faults. A reliable system does not silently continue and deliver results that include uncorrected corrupted data. Instead, it detects and, if possible, corrects the corruption, e.g., by retrying an operation for transient (soft) or intermittent errors, or else, for uncorrectable errors, isolating the fault and reporting it to higher level recovery mechanisms (which may failover to redundant replacement hardware, etc.), or else by halting the affected program or the entire system and reporting the corruption. Reliability is often characterized in terms of mean time between failures (MTBF), with reliability = exp(-t/MTBF).
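As a worked example of that formula (a back-of-the-envelope check, assuming the bc calculator is installed): a component with an MTBF of 100,000 hours has roughly a 91.6% chance of running one year (8,760 hours) without failure.

echo "scale=4; e(-8760/100000)" | bc -l      # exp(-t/MTBF), prints ~.9161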

Availability is the probability a system is operational at a given time, i.e. the amount of time a device is actually operating as the percentage of total time it should be operating. In high availability applications, availability may be reported as minutes or hours of downtime per year. Availability features allow the system to stay operational even when faults do occur. A highly available system would disable the malfunctioning portion and continue operating at a reduced capacity. In contrast, a less capable system might crash and become totally nonoperational. Availability is typically given as a percentage of the time a system is expected to be available, e.g., 99.999 percent ("five nines").
Availability %            Downtime per year    Downtime per month    Downtime per week
90%                       36.5 days            72 hours              16.8 hours
95%                       18.25 days           36 hours              8.4 hours
98%                       7.30 days            14.4 hours            3.36 hours
99%                       3.65 days            7.20 hours            1.68 hours
99.5%                     1.83 days            3.60 hours            50.4 minutes
99.8%                     17.52 hours          86.23 minutes         20.16 minutes
99.9% ("three nines")     8.76 hours           43.2 minutes          10.1 minutes
99.95%                    4.38 hours           21.56 minutes         5.04 minutes
99.99% ("four nines")     52.6 minutes         4.32 minutes          1.01 minutes
99.999% ("five nines")    5.26 minutes         25.9 seconds          6.05 seconds
99.9999% ("six nines")    31.5 seconds         2.59 seconds          0.605 seconds
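A quick sanity check of the table values from the command line, using bc:

echo "scale=2; (1 - 0.999)   * 365 * 24"      | bc    # three nines: ~8.76 hours of downtime per year
echo "scale=2; (1 - 0.99999) * 365 * 24 * 60" | bc    # five nines: ~5.26 minutes of downtime per year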

Serviceability or maintainability is the simplicity and speed with which a system can be repaired or maintained; if the time to repair a failed system increases, then availability will decrease. It includes various methods of easily diagnosing the system when problems arise. Early detection of faults can decrease or avoid system downtime. For example, some enterprise systems can automatically call a service center without human intervention when the system experiences a system fault. The traditional focus has been on making the correct repairs with as little disruption to normal operations as possible.

Performance defines the ability of the system to deliver within its design limits. For example, in ERP software applications this can be expressed as a measure (end-user click-to-click time for a transaction is within 10 seconds) against design limits (a maximum of 50000 order lines per year, 250 named users with a maximum of 20% concurrency, i.e. 50 concurrent users).

Why does this matter?
From the end-user standpoint, computing systems provide services, and any outage in that service can mean lost revenue, lost customers due to dissatisfaction, and, in extreme cases, loss of life and possible legal repercussions. For example, with cell phone services, if it's only in rare cases that I can't get a signal or make a connection, I'll stick with my service provider. But if this occurs too often or in critical locations like my office or home, I'll most likely switch providers. The end result is loss of revenue and loss of a customer. Embedded systems not only provide value added services such as communications, but they also provide critical services for human safety. For example, my anti-lock braking system is provided by a digital control service activated by my ignition. My expectation is that this service will work without failure once ignition is completed. Any system fault that might interrupt service should prevent me from using the vehicle before I start to drive. A failure during operation could result in loss of life and product liability issues.

Ideally, service outages would not be an issue at all, but experienced system architects know they must analyze, predict, and design systems for handling failure modes in advance. For safety-critical systems, this is the due diligence required to avoid product liability nightmares. Historically, system architects have taken two approaches to this problem: building highly reliable, fail-safe systems with low probability of failure, or building mostly reliable systems with quick automated recovery. Both approaches are judged with probability measures. Reliability is evaluated by mean time to failure, and availability is measured by service uptime over a year of continuous operation. The RAS (Reliability, Availability, Serviceability) concept for system design integrates concepts of design for reliability and for availability along with methods to quickly service failures that can't be designed for automatic recovery.

Building systems for very high reliability can be cost prohibitive, so RAS offers an approach to balance reliability with recovery and servicing features to control cost and ensure safety and quality of service. This approach is fundamental to systems where the concern is affordable quality of service, customer retention, and due diligence for customer safety.

Lessons from the design of IBM mainframes

You can learn a lot from the experience built into systems like the IBM® mainframes that have evolved from a rich heritage of design for reliability, availability, and serviceability. This example explores elements of IBM mainframe architecture to assist those developing new architectures by examining the design decisions made in the big iron mainframes. This article gives background on the evolution of RAS features developed for IBM mainframes and summarizes significant design decisions.
To best understand the evolution of RAS in IBM mainframe architecture, it is useful to step back in time to 1964 and examine RAS features in the IBM System/360™ (see Resources) and consider how architects have balanced the issues of cost, reliability, safety, availability, and servicing, and improved upon this over time. Early systems were most often centralized rather than in the hands of end users, and may have been less cost-sensitive than today's mainframes, but the concepts of availability and reliability emerged early and have evolved over time into the well-proven RAS features now found in the z990.
Often, system and application requirements will determine if availability is stressed over reliability or vice versa. For example, the concept of availability has also been fundamental to telecommunication systems, where most often quality of service is more of an issue than safety. In contrast, reliability has been fundamental to systems such as commercial flight control systems where failure means significant loss of life and assets. The balance of availability and reliability features should fit the system -- building FAA (Federal Aviation Administration) levels of reliability into cell phones would make the system much less affordable and therefore not usable by many customers. Likewise, safety-critical systems can't simply quote uptime to convince customers that the systems are not too risky to trust -- fail-safe operation, reliable parts, triple redundancy, and the extra cost that goes along with these design features is expected and will be paid for to mitigate risk.

The difference between availability and reliability
Availability is simply defined as the percentage of time over a well-defined period that a system or service is available for users. So, for example, if a system is said to have 99.999%, or five nines, availability, this system must not be unavailable more than five minutes over the course of a year. Quick recovery and restoration of service after a fault greatly increases availability. The quicker the recovery, the more often the system or service can go down and still meet the five nines criteria. Five nines is often called high availability, or HA.
In contrast, high reliability (HR) is perhaps best described by the old adage that a chain is only as strong as its weakest link. Building a system from components that have very low probability of failure leads to maximal system reliability. The overall expected system reliability simply is the product of all subsystem reliabilities, and the subsystem reliability is a product of all component reliabilities. Based upon this mathematical fact, components are required to have very low probability of failure if the subsystems and system are to also have reasonably low probability of failure. For example, a system composed of 10 components, each with 99.999% reliability, is (0.99999)^10, or 99.99%, reliable. Any decrease in the reliability of a single component in this type of single-string design can greatly reduce overall reliability -- for example adding just one 95% reliable component would drop the overall reliability to 94.99%.
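Checking those numbers from the command line (again with bc):

echo "scale=6; 0.99999 ^ 10"        | bc    # ten five-nines components: .999900, i.e. 99.99%
echo "scale=6; 0.99999 ^ 10 * 0.95" | bc    # add one 95% component:     .949905, i.e. 94.99%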
The z990 includes features that allow it to be serviced and upgraded without service interruption. The subsystem level of redundancy is the "book," which is an independently powered, multi-chip module, with cooling, memory cards, and IO cards. There are also redundant components within a book including CPU spares (2 per book) and redundant interconnection to level-2 cache. Furthermore, the z990 provides jumpering to preserve the interconnection of MP (multi-processor) books while other books are being serviced or replaced in case of non-recoverable errors.
Service or system outages can be caused by routine servicing, upgrades, and failures on most traditional computing systems. Probability of a system or service outage on the z990 is limited to scenarios where failures are non-recoverable by switching redundant components or subsystems into operation. In most component or subsystem failure modes, the z990 is able to isolate the non-recoverable book or component and continue to operate with some performance degradation. Redundant books and components allow the z990 to operate without service interruption until the degraded book is replaced. As you will see later on in this article, designing interconnection networks for isolation and switching of modules like the z990 books is complex. While this complexity adds to cost, it does significantly decrease probability of service interruption. You can find more detailed information on processor book management in the "Reliability, availability, and serviceability (RAS) of the IBM eServer z990" paper (see Resources).

How high reliability helps
It is theoretically possible to build a system with low-quality, not-so-reliable components and subsystems, and still achieve HA. This type of system would have to include massive redundancy and complex switching logic to isolate frequently failing components and to bring spares online very quickly in place of those components that failed to prevent interruption to service. Most often, it is better to strike a balance and invest in more reliable components to minimize the interconnection and switching requirements. If you take a very simple example of a system designed with redundant components that can be isolated or activated, it becomes clear that the interconnection and switching logic does not scale well to high levels of redundancy and sparing. 

A trade-off can be made between the complexity of interconnecting components and redundancy management with the cost of including highly reliable components. The cost of hardware components with high reliability is fairly well known and can be estimated based upon component testing, expected failure rates, MTBF (mean time between failures), operational characteristics, packaging, and the overall physical features of the component.
System architects should also consider three simple parameters before investing heavily in HA or HR for a system component or subsystem:
  • Likelihood of unit failure
  • Impact of failure on the system
  • Cost of recovery versus cost of fail-safe isolation

How cost and safety factor in
HA design may not always ensure that a design will be safe. Much depends on how long service outages will be during recovery scenarios. For safety-critical systems such as flight control or anti-lock braking, it is possible that even very brief outages could lead to loss of system stability and total system failure. Thus, for safety-critical systems, the balance between HA and HR must often favor HR in order to avoid risky recovery scenarios.

Achieving HA/HR with redundancy as the primary method
One approach to HA is to use redundancy and switching not only to increase availability, but also so that lower-reliability (and most often lower-cost) components can be used, which for some systems yields the target HA at the lowest overall system cost. The best example of this approach is RAID (Redundant Array of Inexpensive Disks); see Resources for examples and links. In fact, numerous RAID configurations have been designed which make trade-offs between HA/HR and serviceability by using larger numbers of disk drives of various types including SCSI, Fibre Channel, Serial-Attached SCSI, and Serial ATA drives. RAID can also provide improved storage performance by striping writes/reads over multiple drives as well as using drives for parity or more sophisticated error encoding methods. Typical RAID systems with volume protection allow for a drive failure and replacement with automatic recovery of the volume and no downtime given a single drive failure. Protection from double faults or failures while a RAID system is recovering has become of interest more recently and has led to the development of RAID 6. One of the more interesting aspects of the RAID approach is that it relies not only upon specialized HA hardware, but also on fairly complex software.
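To make the RAID example concrete, this is roughly what a small software RAID 5 set with a hot spare looks like with Linux mdadm - a sketch only; the device names are placeholders and a real enterprise array is administered very differently:

mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sdb /dev/sdc /dev/sdd --spare-devices=1 /dev/sde
mdadm --detail /dev/md0    # shows the array state, including the automatic rebuild onto the spare after a drive fails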

RAS designs should span hardware, firmware, and software layers
Perhaps much harder to estimate is the cost of highly reliable software. Clearly, reliable hardware running unreliable software will result in failure modes that are likely to cause service interruption. It is well accepted that complex software is often less reliable, and that the best way to increase reliability is with testing. Testing takes time and ultimately adds to cost and time to market.
Early on, system architects focused on designing HA and HR hardware with firmware to manage redundancy and to automate recovery. So, for example, firmware would reconfigure the components in the example in Table 1 to recover from a component failure. Traditionally, rigorous testing and verification have ensured that firmware has no flaws, but history has shown that defects can still wind up in the field and emerge due to subtle differences in timing or execution of data-driven algorithms.

High reliability software often comes at a high cost
Designing firmware and software for HR can be costly. The FAA requires rigorous documentation and testing to ensure that flight software on commercial aircraft is highly reliable. The DO-178B class A standard requires software developers to maintain copious design, test, and process documentation. Furthermore, testing must include formal proof that code has been well tested with criteria such as multiple condition decision coverage (MCDC). This criterion ensures that all paths and all statements in the code have been exercised and shown to work. It is very laborious and therefore greatly increases the cost of software components.

Cost trade-offs in the hardware layer
Designing for HR alone can be cost prohibitive, so most often a balance of design for HA and HR is better. HA at the hardware level is most often achieved through redundancy (sparing) and switching, which is the case for the z990 and has been fundamental to IBM mainframe design since the System/360. A trade-off is made between the cost of duplication and simply engineering higher reliability into components to reduce the MTBF. Over time, hardware designers have found balance between HR and HA features to optimize cost and availability. Fundamental to duplication schemes is the recovery latency. For example, the z990 has a dynamic CPU sparing (DCS) feature that can cover a failure so that firmware is unaffected by reconfiguration to isolate the errant CPU and to switch in the spare.
When considering component or subsystem duplication for HA, architects must carefully consider the complexity and latency of the recovery scheme and how this will affect firmware and software layers. Trade-offs between working to simply increase reliability instead of increasing availability through sparing should be analyzed. A simple well-proven methodology that is often employed by systems engineers is to consider the trade-off of probability of failure, impact of failure, and cost to mitigate impact or reduce likelihood of failure. This method is most often referred to as FMEA (Failure Modes and Effects Analysis); see Resources. Another, less formal process that system engineers often use is referred to as the "low hanging fruit" process. This process simply involves ranking system design features under consideration by cost to implement, reliability improvement, availability improvement, and complexity. The point of low hanging fruit analysis is to pick features that improve HA/HR the most for least cost and with least risk. Without existing products and field testing, the hardest part of FMEA or low hanging fruit analysis is estimating the probability of failure for components and estimating improvement to HA/HR for specific features. For hardware, the tried and true method for estimating reliability is based upon component testing, system testing, environmental testing, accelerated testing, and field testing. The trade-offs between engineering reliability and availability into hardware are fairly obvious, but how does this work with firmware and software?
Cost trade-offs in the firmware and software layers
Designing and implementing HR firmware and software can be very costly. The main approach for ensuring that firmware/software is highly reliable is verification with formal coverage criteria along with unit tests, integration tests, system tests, and regression testing. Test coverage criteria include feature points, but for HR systems much more rigorous criteria are necessary, including statement, path, and, in extreme cases, multiple condition decision coverage (MCDC). The IBM z/OS testing has followed this rigorous approach to ensure HR. (This testing is described in the article "Testing z/OS"; see Resources.) The FEDC system in z/OS provides support for HA, recognizing that despite rigorous testing, software/firmware testing cost and time spent must be limited at some point, and the design should include support for quick operator-assisted recovery. Finally, the FEDC is also very useful for application developers who most likely would also like to strike a balance between rigorous testing of their application and provision for recovery.

Component-level error detection and correction
Ideally, all system, subsystem, and component errors can be detected and corrected in a hierarchy so that component errors are detected and corrected without any action required by the containing subsystem. This hierarchical approach for fault detection and fault protection/correction can greatly simplify verification of a RAS design. The z990 ECC memory component provides for single-bit error detection and automatic correction. The incorporation of ECC memory provides a component level of RAS, which can increase RAS performance and reduce the complexity of supporting RAS at higher levels.

Redundancy management at the subsystem level
The z990 includes a number of redundancy management features that provide online replacement/upgrade, automatic recovery with spares, and continuous operation in degraded modes for double and triple faults that require servicing. The z990 organizes processing, memory, and I/O resources into logical units called books, which are interconnected so that they can be switched into or out of operation without interruption to the overall system. This scheme includes the ability to add and activate processors, memory, and I/O connections while the system continues to run. The fault-tolerant book interconnection and cross-book CPU sparing provides excellent automatic recovery as well for most fault scenarios.

Finally, support subsystems such as cooling also include redundancy so that the system is not endangered by thermal control system faults. Redundancy and management of that redundancy with automated fault detection, isolation, and automatic recovery is fundamental to the z990 RAS design.

Fail-safe design
HR systems often include design elements that ensure that non-recoverable failures result in the system going out of service, along with safing to reduce risk of losing the asset, damaging property, or causing loss of life. The z990 may or may not be used in applications that are considered safety critical -- arguably, even a database error could result in significant loss of assets (for stock market applications) or even loss of life (in certain health care applications). The z990 does incorporate fail-safe modes when recovery is impossible or too costly to incorporate (for example, double processor faults in a single book) and the likelihood of a failure is low. In the case of double or triple faults, the z990 isolates the failed subsystem (a book) and requires operator assistance -- for most users. This is likely a good cost-versus-HA/HR trade-off, given the built-in support for serviceability in the z990 (on-line replacement).

Serviceability concepts
The z990 strikes a nice balance between HA, HR, system safety (permanent loss of data could have high related risk), and simplicity of operation and servicing. In most cases, this tracking of errors, data logging and upload to IBM with RETAIN and configuration tracking of FRUs (field-replaceable units) simplifies service calls, sometimes (or often) allowing a technician to handle a non-recoverable failure or upgrade without causing service interruption. The z990 organization of books as field-replaceable units with support to assist the operator in recovery greatly increases the z990 serviceability for faults that occur despite the HR/HA design.

Recovery concepts
Conceptually, architects should consider how levels of recovery are handled with varying degrees of automation, as depicted in Figure 2.
Figure 2. Supporting multiple levels of recovery autonomy


The z990 approach includes all levels of recovery automation and management between fully automatic, operator assisted, and fully manual. Manual recovery requirements are minimized, and manual recovery still includes isolation features so that FRUs can be replaced easily to restore full performance without service interruption and without impact to other processor books.

Putting it all together
The z990 is an excellent example of good RAS design because RAS is considered at all levels of hardware, from components and subsystems to the system level, and because firmware/software design for RAS is considered in addition to hardware. Perhaps most important, though, is the fact that the z990 RAS design spans all of these levels and layers so that overall RAS is a system feature. The well-integrated RAS features of the z990 no doubt increase the cost of the system considerably. However, the z990 provides customers with an HA/HR computing platform with low risk of losses due to downtime. The cost trade-off will vary for each system design based upon the risk and cost associated with downtime compared to the cost of decreasing the overall probability of downtime. The system architect has to find the right balance of cost, complexity, recovery automation, reliability, and time to market for each, given the system's intended usage. There is no simple formula for HA/HR analysis, but one place to start is with careful consideration of what the risk and cost is of being out of service -- if this can be measured in dollars lost per hour or day, potential loss of life, or loss of customers, this is a good starting point. If my business will potentially lose thousands of dollars per hour while my enterprise system is out of service, then I will be more willing to pay much more for HA/HR.

Is autonomic architecture the future for RAS design?
Strategies for RAS like the z990 have been refined and improved significantly since the concepts of continuous availability were introduced in early mainframes like the System/360. Clearly, RAS requires balance between safety and uninterrupted service on the one hand and the cost to provide these features on the other. The cost of additional RAS performance can be high, and this cost must be balanced with the risk and cost of occasional service failures and safety. How much is enough? How can the additional cost of better RAS be financed?

One concept is that systems that require little to no monitoring are not only more cost efficient, but necessary, if automation is to scale up significantly beyond present day systems. Enterprise systems include processing, storage, and I/O, with thousands of interconnections, hundreds of processors, and terabytes of data. Quick access to widely distributed information for decision support, global operations, and commerce is required, and the volume and speed of information flow is scaling beyond the point where traditional monitoring and service methods can be applied. Autonomic architecture is an alternative to traditional systems administration that uses RAS as a starting point to further reduce the human attention required to maintain enterprise systems. A full description of autonomic architecture is beyond the scope of this review of RAS strategies; however, you can find more information in the Resources section below.


Resources
This article was extracted from Wikipedia and IBM.

Saturday, December 17, 2011

Using Neo4J to visualize SAP PI interface data.

Neo4j is a high-performance, NoSQL graph database with all the features of a mature and robust database. A graph database is ideally suited for highly connected data. Marko Rodriguez compares the performance of MySQL and Neo4j for a graph query with 1, 2, 3, 4 and 5 connections. The information model for a typical SAP PI setup is highly connected. Let's look through the concepts that I want to visualize and explore.

SAP PI system, Party, Service (or Component), Interface, Interface Map, Message Map, Communication Channel, and runtime information of all messages (approx 80K per day = 28M per year).

It should be fast and let me navigate relationships between objects. I would like to be able to make queries like:
Show me all messages between 3:05am and 3:55am with all the relevant information about parties etc.
Show me a summary of all messages between 3:05am and 3:55am with all the relevant information about parties etc.
Show me which interface flows (sender/receiver) are affected if a communication channel is down.

Some notes on implementing Neo4J
To set up Neo4j, download the Community version from http://neo4j.org. After you get the Neo4j server running, launch the browser at http://localhost:7474. Alternatively, you can launch the neo4j-shell command to look around.
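Getting the server and the shell running from the download looks roughly like this (a sketch; the archive name and version are examples):

tar xzf neo4j-community-1.5-unix.tar.gz
cd neo4j-community-1.5
bin/neo4j start          # web interface comes up on http://localhost:7474
bin/neo4j-shell          # the interactive shell used below

Once inside the shell, these are the commands that got me started.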

cd
mkrel --cd -c -d o -t system -v --np "{'name':'PXP'}"
mkrel --cd -c -d o -t party -v --np "{'name':'PCE_LNSFA','type':'XIParty', 'partyagency':'http://sap.com/xi/XI','partytype':'XIParty'}"

The first line takes you to the reference node which is created by default in the starting database. 
The second line creates an outbound relationship called "system" and a new node at the end of the relationship. The new node has the property "name" = "PXP". The "name" property is used by the shell to give a friendly name to the node id. The "--cd" option moves you to the newly created node.
The third line creates an outbound relationship called "party" and a new node at the end of the relationship. The new node has the property "name" = "PCE_LNSFA" and a few more properties. So let's take a look at our graph.
cd
trav

The first line takes you to the starting node and the second line will traverse the entire graph for you.

mkrel --cd -c -d o -t interface -v --np "{'name':'IOA_ChangeCRMTaskRequest','type':'XIInterface', 'namespace':'http://philips.com/pce/lotesnotes/sales/tasks','addressid':'4ACB55C22850085AE1008002828BD49C'}"
index -i interface addressid

The first line here creates an outbound relationship called "interface" and a new node at the end of the relationship. The new node has the property "name" = "IOA_ChangeCRMTaskRequest".
The second line adds the newly created node to an index called "interface", with the property addressid stored in the index. This makes it easy to look up the node later using the following
index --cd interface addressid "4ACB55C22850085AE1008002828BD49C"
This will find the interface node and take you directly to that node. If we had indexed the first two nodes like this
index -i myindex system
index -i myindex xitype
we could have searched using
index --ls -q myindex "system:PXP AND xitype:service"

Once you get used to the basics, you can try your hand at advanced queries (or traversals in case of a graph). Take a look at the syntax of Cypher query language to understand this.
start n = (0) 
match (n)-[:systems]->(sys)-[:party]->(parties)-[:service]->(service) 
where sys.name = 'PXP' 
return n,sys,parties,service

The variables are specified as (varname). Basic syntax is
start 
match
where
return

You can use aggregate functions as well.
start n = (0) match (n)-[^1..4]-()-[:xi_interface]-(i) return i
start n = (0) match (n)--(p)-[^1..4]-()-[:xi_interface]-(i) return n,p, count(*)
The first line will display all nodes that are linked 1-5 relationships away from the reference node and the last link is xi_interface.
The second line sums this up, grouping by (p), i.e. the system, so it displays the number of interfaces per system.



start n=(message,'firstday:20110930 AND firstts:[20110930000000 TO 20110930005900]') match (n)-[:sends]-(s) where s.system='PXP' return s.party,count(*)
start n=(message,'procmode:S') match (n)-[:sends]-(s) where s.system='PXP' return s.party,s.service,s.interface,count(*)
start n=(message,'procmode:S') match (r)-[:receives]-(n)-[:sends]-(s) where s.system='PP4' return r.party,count(*),avg(n.latency),avg(n.msgsize)
start n=(message,'procmode:S') match (r)-[:receives]-(n)-[:sends]-(s) where s.system='PXP' return r.party,r.service,r.interface,s.party,s.service,s.interface,n.firstts,n.lastts,n.latency,n.dbentry,n.maprequ,n.msgsize,n.msgid
start n = (0) match (n)--(sys)--()--()--(iface)--(message) where message.latency > 2000 return iface,count(message)




Thoughts after the experiment
After a certain number of records, the database slows down. I tried this with a database of 22GB; some queries did not return even after a long time, so you may want to split the data into multiple databases. The web interface is fine for visualizing a limited number of graph nodes but does not scale up. Use the command line when you have too many nodes.

Sunday, September 04, 2011

My interview with Kevin Benedict

Kevin Benedict interviewed me at the Enterprise Mobility 2011 conference in Brussels, Belgium. We discussed topics that are important as you develop a mobility strategy for your business. The interview is part of the Mobile Expert Video series.

See the interview at Kevin's site

See the interview directly on Youtube



Setting up iPads for the enterprise

After getting some requests on how we set up iPads at Philips, I'm sharing the basic setup that we use.

  1. Request the Afaria user ID and password. This will be needed in the following steps.
  2. Install the SIM card.
  3. Connect the iPad to a computer with iTunes. Disconnect once the "connect to iTunes" prompt disappears from the iPad (i.e. do not set up the iPad in iTunes).
  4. On the iPad, go to Settings > Wi-Fi and connect to WLAN-PUB.
  5. Click the URL http://t-systems.mobidm.com/start to start enrolling your device.
  6. Log in to the system with your Afaria user ID and password.
  7. Click the first button to enroll your device. This will bring up a request to enroll your device and install a profile. Accept.
  8. Go back to the Safari browser and click the second button (to download Afaria).
  9. In the App Store, click "Free" to start the download. When the Apple ID dialog opens, click "Create new account". Follow the steps to create a new account, but do not enter credit card details (i.e. select None for the credit card type). (You can also click here to download Afaria.)
  10. Ask the secretary to forward the email to you.
  11. Click the link in the email to verify the email address and finish the verification/setup of the Apple ID.
  12. Install Afaria. Then go back to the browser where the T-Systems page is showing. Click the last button to configure Afaria.
  13. Install Brainloop.
  14. Install Penultimate.
  15. Install Adobe Ideas.
  16. Install Socialcast. (Note: after installation, start the app and click the gear/settings icon to change "api.socialcast.com" to "connectus.socialcast.com".)
  17. Install Pages.
  18. Install Numbers.
  19. Install Keynote.
  20. Install Goodreader.
===== (The rest needs personal information specific to the end user of the iPad) =====
  21. Set up the passcode.
  22. Create a mail account with the Exchange settings (Server: www.mail.philips.com, Domain: Code1, UserName: (nly or usd etc.), Password).
  23. Set up ConnectUs/Socialcast. (Note: start the app and click the gear/settings icon to change "api.socialcast.com" to "connectus.socialcast.com".)
  24. Set up Brainloop.

Saturday, June 11, 2011

Xcode 4.3 with iOS 5 beta

After downloading the Xcode 4.3 beta with the iOS 5 SDK, the Organizer function to share and archive stopped working with a cryptic error: "No such file or directory found".
It turns out that this is related to having two different versions of codesign_allocate. To fix the problem, do the following in a terminal window.

sudo ln -s /Developer/Platforms/iPhoneOS.platform/Developer/usr/bin/codesign_allocate /usr/bin

After that you can share the IPA with organizer again.

You can also package from the Terminal window with the following commands in the directory containing your XCode project.

xcodebuild -target LocateCustomer -sdk iphoneos build


/usr/bin/codesign -f -vv -s "Divya Mahajan (5E58XNSTHR)" build/Release-iphoneos/LocateCustomer.app


xcrun -sdk iphoneos PackageApplication -v "build/Release-iphoneos/LocateCustomer.app" -o build/LocateCustomer.ipa --sign "iPhone Developer: K Developer (5E68XNSTHR)"

--- Here LocateCustomer = the target name in the project.
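If the --sign step complains about the identity, you can list the code-signing identities available in your keychain (the identity names above are specific to my account; use whatever this prints for yours):

security find-identity -v -p codesigning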

Sunday, August 29, 2010

First impressions - Windows 7 + Office 2010 - from July 2009

I've just finished installing Windows 7, Visual Studio 2010 beta 2, Office 2010 beta with Sharepoint Designer 2010. Since these are all beta programs, I've installed them in a virtual machine. This avoids the need to dual boot and I can use my existing OS alongside.
Setup:
Windows 7 x64 - with 2GB RAM, single CPU @ 1.83 GHz. The VMware installation is on an external USB drive.
Installation issues:
With the above combination, installing the Azure tools for Visual Studio is difficult. They require .NET 3.5 SP1, which is preinstalled in Windows 7; however, the Azure installer doesn't recognize this. I'll figure out this puzzle later.

Performance:
Office 2010 performance is amazing. The startup times for all programs are so short that they jump out!
PowerPoint 2007 took a good 1-2 minutes to start up and felt like a whale. PowerPoint 2010 feels snappy by comparison. Even Outlook's shutdown was quick.

Look and feel:
Office 2010 continues the move to the Ribbon. All programs have the Ribbon interface. The overall look is cleaner and "sparse". The programs try to stay out of your way so you can concentrate on your document.

Bugs?
Outlook 2010 couldn't save my credentials.

Thursday, October 22, 2009

SharePoint Resources

SharePoint Resources: "SharePoint Resources" lists Internet sites built on the Sharepoint platform. It also has links to other resources.
See study on Kraft sites and Kroger sites.

Monday, October 19, 2009

Insulating an Unfinished Attic Tutorial

Insulating an Unfinished Attic Tutorial

Keeping this link for future home projects. It takes you step by step through the process of insulating your attic.

Friday, October 16, 2009

Extending Sharepoint search with custom properties

 

Situation:

Empower.Me would like to store marketing documents, pictures, PDFs etc in Connect Share. The items are stored in a Sharepoint document library with a custom content type. This custom content type has additional properties that further describe each item. These properties are business relevant and help in categorization. Examples: Validity dates, Philips business unit, CTN or product identifier, etc.

Some of these properties can be mapped to standard (out of box) Sharepoint properties, but some are new to Sharepoint.

EmpowerMe would like to find the documents that match certain property values (instead of a full text search).

Investigation results:

The Connect Sharepoint environment built on Microsoft Office Sharepoint provides 3 standard methods for searching items.

  1. Search page (https://www.emea.sharepoint.philips.com/Search/Pages/results.aspx)
  2. Advanced search page (https://www.emea.sharepoint.philips.com/Search/Pages/Advanced.aspx )
  3. Search web service (https://www.emea.sharepoint.philips.com/sites/TS0903310710272080427802/_vti_bin/search.asmx?WSDL)

The results returned by these three alternatives are identical.

All of the above methods permit you to search by keywords / property values or by full text. To query a property, just include <propertyname>:<value> in the query. If the value has a space, enclose it within double quotation marks. An example of this is in Exhibit 1.

Exhibit 1: Search for PDF documents by author (Rinia, Jornie) in the empowerMe site

Site:https://www.emea.sharepoint.philips.com/sites/TS0903310710272080427802/ filetype:pdf Author:"Rinia, Jornie"


Now let us try this syntax on items with custom properties. We take an item from the EmpowerMe Document Library. This item has a custom property “ChapterName” = “Introduction to empower.Me”. It also has a standard property “Title”=”Introducing the empower.me…..”. (See Exhibit 2)

Exhibit 2: Item with a custom property “ChapterName”


A search on documents with content type “Empower.me Guideline” turns up 3 documents.

Exhibit 3: Search for ContentType “Empower.me Guideline”

Site:https://www.emea.sharepoint.philips.com/sites/TS0903310710272080427802/ Author:"Vaassen, Guido" ContentType:"Empower.me Guideline"


A search on the property ChapterName doesn’t seem to work. It returns no results. (see Exhibit 4).

Exhibit 4: Search for ChapterName containing Introduction

Site:https://www.emea.sharepoint.philips.com/sites/TS0903310710272080427802/ Author:"Vaassen, Guido" ChapterName:Introducing


However, if you search on Title, you get these two results. See Exhibit 5.

Exhibit 5: Search for Title containing Introducing

Site:https://www.emea.sharepoint.philips.com/sites/TS0903310710272080427802/ Author:"Vaassen, Guido" Title:Introducing


So it appears that custom properties are ignored by the Search in Connect Share.

However, looking a bit deeper, you can see that there is a standard way to extend Search to consider these custom attributes. A bit of background first.

When the search engine “crawler” indexes items in Sharepoint, it stores all custom properties in the index as “Crawled properties”. These crawled properties are ignored during search (except if it is a string, it will be included in the full text search for that item). The custom property “ChapterName” is included in the crawled properties. Microsoft Office Document properties are also available as crawled properties.

The properties that are available for searching are called “Managed properties”. You must create a new Managed Property and map it to your “crawled property”.

Fortunately, extending search for your own attributes does not require any programming or coding. Instead, creating and mapping a new managed property can be done in a few minutes by a Sharepoint site administrator. (See instructions at http://j.mp/3BnZnQ and http://j.mp/3ulrlT )

Once you have a managed property, the administrator can change the search options for this managed property.

  • In the Search Options section, select Include this property in Advanced Search options to allow users to perform advanced searches using this property.
  • Select Include this property in the content index to include this property in the content index, so you can search for items based on this property.
  • Select Allow property to be displayed to make this property available for display in custom search applications.
  • Select Display this property in item details in search results to display this property in the Item details section for each item in search results.

An additional enhancement is the concept of search scope. Search scope restricts the selection of documents that are returned. As an example, a search scope called Marketing can be setup to search all items with the content types setup for marketing, like “EmpowerMe guidelines”.

Conclusion

Connect Share can be quickly extended to add additional properties that are relevant to a parametric search. Parametric search combined with scope provides a method to make search more relevant to the users.

External web references:

Searching on property values in Microsoft Search: http://j.mp/WGAjh

Program to list all crawled and managed properties in Sharepoint (Need sharepoint administrator access to /ssp/admin):

Extending Microsoft search for image search: http://j.mp/1Gqj93

Managing “Managed properties”: http://j.mp/3BnZnQ and http://j.mp/3ulrlT

Monday, October 12, 2009

Monday, September 28, 2009

Posting on the fly

Twitter and Yammer are fine for short mobile messages but blogging from my itouch beats both of them.




Thursday, May 21, 2009

P2P networking

To start with Peer to Peer networking, install the Peer 2 Peer component (XP – Add remove programs>Add remove Windows Components>Network components>Peer 2 Peer).

Turn on your IPv6 stack.

net start p2psvc

netsh p2p pnrp cloud show list

I was not able to connect with the seed server

netsh p2p pnrp diag ping seed

 

To turn on Teredo-based IPv6 for your PC on XP, make it the DMZ server if behind a NAT firewall, or redirect ports 3544 and 3540 to your machine.

 

netsh interface ipv6 set teredo client teredo.ipv6.microsoft.com

netsh interface ipv6 set teredo enterpriseclient
(if on a Domain)

(I also found teredo.alicenet.de but it did not give me an address).

The IGD tool from Microsoft helps you evaluate your network connection to the Internet.

A nice diagram on connecting IPv6 and IPv4.

A good post on setup of IPv6 (with Teredo) at home; setup of IPv6 with tunnelbroker and more at this link.

This post discusses how you can setup a Teredo broker (Miredo). It also lists some Teredo servers around the world.

 

When Teredo is running on my PC, I am able to ping6 ipv6.l.google.com; but I cannot get IE or Firefox to use IPV6. Both browsers use IPv6 when I use the Hexago Gateway IPv6 Tunnel client.

Installing Ubuntu 8.10 (Intrepid Ibex)

Download the ISO from the Ubuntu site. It was very easy to install; however, the updates took quite a while (2 hrs @ 16 Mbps).

 

Links

Installing a web proxy server

Setup FreeNX for a fast remote desktop client

When looking at the out-of-box applications, I got interested in the VOIP softphone (Ekiga). It has a Windows client as well, but that was not easy to locate. I got a free SIP address for myself. (sip:cs905s@ekiga.net)

While going through the setup of Ekiga, the NAT detection gave “STUN test result: Port restricted NAT”. Since I didn’t get any meaningful explanation on Google, I opted to enable STUN support. For a headset, I opted to pull out my Bluetooth USB adapter and hook it up to the Motorola Bluetooth headset. (I lost the manual and found this nice tutorial on the net. I must wait for 2 hours for the headset to charge, and I need to figure out how to connect the headset to the USB adapter. Not that simple? Rather easy compared to Windows XP.)

Ekiga provides a call-out service (diamondcard.us) where I created a divyamahajan account. However, it is still on a security hold even after the PayPal payment was made.

Eventually, Ekiga did not work.

Decided to throw in the towel and leave it.

ISPs providing native IPv6

rh-tec.de does provide native IPv6 in Germany. They are on the pricey side however.

For a longer list of native IPv6 providers, try this link at Hurricane Electric. Their state of IPv6 report is worth reading.

Tuesday, January 13, 2009

Linking popup information with Apture

www.apture.com provides a simple way to embed information from other sites into your webpage. It is useful to link multimedia files, Wikipedia lookups etc., without requiring the user to leave the context of your web page.

Thursday, January 08, 2009

Cleaning up your Start menu

Download Orphans Remover Now

Most of the time when you install a new application, shortcuts (*.lnk files) are created to help you launch the application faster. The desktop, the Start menu and the user's documents folder are the common locations where you can find these shortcuts. The shortcuts are useful as long as the applications are not uninstalled from your system. Once the shortcuts are broken, they just clutter up your Windows installation.

In most cases, shortcuts become broken when you remove or uninstall the programs behind them using Add/Remove Programs in the Control Panel. The uninstallation is never clean, leaving leftovers behind, e.g. broken shortcuts. These shortcuts are no longer needed and therefore should be removed from your Windows installation.

Finding all the broken and invalid shortcuts on your system by hand can be a tedious and time-wasting job. Instead of doing it manually, you should try Orphans Remover. Orphans Remover is a freeware Windows application that searches for and deletes broken shortcuts (*.lnk files) on your Windows desktop, Start menu, recent documents and more. On the main window of Orphans Remover, you can specify the folders that you want to scan for broken shortcuts. Orphans Remover can search for broken shortcuts in the Windows Start menu, desktop, favorites, history, recent documents, temp directory, program files and application data. You can also expand the scan to include files on removable drives, network shares, CD-ROM drives and RAM disks. In addition, Orphans Remover supports user-defined folders, so you can scan other directories for broken shortcuts beyond the preset locations.

After specifying the locations for scanning, click the “Start Scan” button to scan for broken shortcuts. After the scan completes, all the broken shortcuts are displayed. To delete the broken shortcuts, click the “Delete Orphans” button.

Monday, January 05, 2009

Creating an IPv6 subnet at home

With multiple computers at home, I want to set them all up with IPv6 addresses. For that, I need to get some questions answered:

1. Who will assign me a set of global unicast IPv6 addresses that I can use?

2. How will IPv6 traffic get routed between these computers to other IPv6 computers?

Since my ISP (Hansanet Alice) does not appear to support native IPv6, I must use an IPv6 tunnel over IPv4. I will use the Freenet6 (Go6.Net) service with the Hexago Gateway6 client.

After a lot of reading, the actual setup turned out to be surprisingly easy.

Setup

Windows XP (Gateway6 client and router) – Install the Gateway client. First verify it is working and then set it in router mode. Here are the steps I used - In the Gateway6 client application, click the Advanced tab. At the bottom, select the “Enable Routing Advertisements”. Select the LAN or wireless interface that is your local network. (IMPORTANT: If you don’t select a valid interface, netsh crashes when you try to connect).

Windows Vista (automatic IPv6 configuration) – Ensure the Vista machine is connected to that network. Restart the machine or simply disable and re-enable the network adapter. (Start Run “ncpa.cpl” and disable / enable the LAN or wireless connection).

Windows XP (automatic IPv6 configuration) – Check if IPv6 is installed (From the command line, ipconfig /all. IPv6 is installed if there are any fe80:* addresses. Install it with “ipv6 install”.) Restart the machine or simply disable and re-enable the network adapter. (Start Run “ncpa.cpl” and disable / enable the LAN or wireless connection).

In both cases, you can use “ping –6 ipv6.l.google.com” to verify that you are connected to the IPv6 internet.

The real test is to check if your machines can be reached from outside by other IPv6 machines.

Useful commands

ipv6 if 1  -- See ipv6 details of the interface. Change the number 1 to other numbers to see other interfaces.

netsh interface ipv6 show address – See all ipv6 addresses assigned on your machine.

netsh interface ipv6 show route – See route on your machine.

VMware virtual machines

The above setup worked with VMware virtual machines with two caveats. The network adapter should be in “bridged” mode and it should be connected to an ethernet interface (VMware does not support IPv6 when you bridge to a wireless card -- forum bug report post).

Nice links

Microsoft’s introduction to TCP/IP

Hexago’s deploying IPv6 over IPv4

Interactive tutorial on IPv6

Found this blog describing the steps and more.

Friday, January 02, 2009

Getting on to the IPv6 bandwagon

While waiting idly for a download to complete, I decided to see what it would require to move to IPv6 on my home network. A quick search indicated that my ancient Netgear WG834GB does not support IPv6. I got a bit distracted by Jonathan’s pages on installing OpenWRT on the WG834G. The WG834GB runs a Linux variant and you can telnet into the machine and look around. However, I don’t have an alternate wireless router, so I did not take the risk of installing OpenWRT.

After reading more about IPv6, I decided to go ahead and take the plunge. Hexago had a nice article on how to go about running IPv6 over IPv4. Microsoft gives a good overview of a home setup of IPv6.

I went to go6.net and registered as cs905s. I downloaded and installed Hexago’s Gateway6 client from go6.net. Start up the client and enter the broker address (broker.freenet6.net) and your user/password. It connected and provided me with an IPv6 address (2001:05c0:1000:000b:0000:0000:0000:1b5d) and a brokered address.


Later I used another broker - broker.aarnet.net.au. I created another cs905s account – the server sent a random password back. This broker is based on Hexago, so I could use the Gateway6 client with it too. Unlike the freenet6 broker, it did not provide me with a brokered address.

Firefox did not like ipv6.l.google.com but Internet Explorer v7 had no problems connecting to the site.

The http://www.sixxs.net/tools/ipv6calc/ site provides some quick information about your IPv6 link. It also provides some fun links, like Virgin Radio.

My conclusions (not verified): use IPv6 to give unique addresses to all PCs, even those behind the NAT. There are two ways to do this, Teredo tunneling and native IPv6. Use the Hexago client and server at this point to get IPv6 addresses for existing machines. I’ll have to look into the native Win2008 and Vista support for IPv6. An alternative to Hexago’s client is AICCU at SixXS. I had to upgrade Remote Desktop to support IPv6, since the XP version doesn’t support it.

In IPv6, classes no longer exist (Class A, B, C…). In fact, even in IPv4 they are dead. The replacement is CIDR, which allows variable-length network prefixes. This link lets you calculate your CIDR.

“A subnet mask is a bitmask that encodes the prefix length in a form similar to an IP address: 32 bits, starting with a number of 1 bits equal to the prefix length, ending with 0 bits, and encoded in four-part dotted-decimal format. A subnet mask encodes the same information as a prefix length, but predates the advent of CIDR.

CIDR uses variable-length subnet masks (VLSM) to allocate IP addresses to subnets according to individual need, rather than some general network-wide rule. Thus the network/host division can occur at any bit boundary in the address. The process can be recursive, with a portion of the address space being further divided into even smaller portions, through the use of masks which cover more bits.”
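As a quick illustration of that relationship, this shell sketch turns a prefix length into the dotted-decimal mask (/20 is an arbitrary example):

p=20
m=$(( (0xffffffff << (32 - p)) & 0xffffffff ))
echo "$(( (m >> 24) & 255 )).$(( (m >> 16) & 255 )).$(( (m >> 8) & 255 )).$(( m & 255 ))"   # prints 255.255.240.0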

The Microsoft site has a very good introduction to TCP/IP. If you are deploying in the office, look at this PDF (link).

Unrelated – TeraCopy is a good tool for copying files: http://www.codesector.com/teracopy.php

UPDATE: Found this blog describing the steps and more.