I decided to play with some numbers over the weekend, inspired by the unbelievably bizarre events in the US Senate during the Kavanaugh confirmation hearings.
A core principle of free and fair elections, a cornerstone of democracy, is one vote per person, with all votes counted equally, right? Not so in the US.
So, to put some numbers on it, I dug up the population by state and did some simple calculations.
The representation in the House of Representatives is pretty closely matched to the population. There’s always going to be some difference since it’s based on the previous census, but it’s close enough. That part is OK.
The US Senate is another story though. Each state has 2 senators, for a total of 100 for the 50 states. It may have been a good idea back in the late 1700s, but it’s absurdly skewed today. I live in California and we have 2 senators representing the roughly 40 million people living here. That’s 12.1% of the US population. Now, if you try to match that population by adding up the states from the smallest on up until you reach 40 million, you get 22 states with 44 senators (out of 100). That’s 44% of the US senators. So on one hand you have the 12.1% of the population in California represented by 2% of the senators. The other 12.1% (adding up the smaller states) have 44% of the Senate.
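For anyone who wants to reproduce this, here is a minimal sketch of the calculation. The few population figures shown are rough placeholders for illustration only; substitute the actual census numbers per state.

```python
# Sketch: add up the smallest states until their combined population
# matches California's, then compare senator counts.
# The populations below are rough placeholders -- plug in real census data.
state_population = {
    "Wyoming": 0.6e6,
    "Vermont": 0.6e6,
    "Alaska": 0.7e6,
    # ... the remaining states ...
    "California": 39.5e6,
}

target = state_population["California"]
total_us = sum(state_population.values())

matched_states, running = 0, 0.0
for name, pop in sorted(state_population.items(), key=lambda kv: kv[1]):
    if name == "California" or running >= target:
        continue
    running += pop
    matched_states += 1

print(f"California: {target / total_us:.1%} of the population, 2 senators")
print(f"Smallest {matched_states} states: {running / total_us:.1%} of the population, "
      f"{2 * matched_states} senators")
```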
Note, this is just a numbers game, strictly by population numbers by state. I haven’t weighted it by party, but if you look at the map below, you’ll see that this favors one party much more than the other.
[Map: US Senate representation vs. population by state]
Now, I could make a very similar argument for the underrepresentation of Texas, Florida and New York (together with California, they are the 4 largest states by population). You would need to add up 34 state populations to match those 4 largest states, i.e. 68 senators vs. the 8 that represent California, Texas, Florida and New York.
There are arguments for checks and balances, and for not going strictly by population numbers so that the voices of the smaller states are heard. I can understand that, but the numbers are so skewed today that the current system just doesn’t make sense.
Maybe it’s time to limit the impact of the Senate? Much like European countries have adjusted to the times and limited (or abolished) the upper houses of their parliaments. The UK is a good example of this, where the upper house, the House of Lords, can scrutinize bills and force reconsideration, but rarely is able to stop a bill.
To make matters worse, the election that is supposed to choose the president for all of the USA is also skewed towards the smaller states. Not as extreme as the Senate, but enough so that we got a White House Squatter who took possession with 2.8 million fewer votes than the woman who in any normal democracy would have been the winner. If the president is supposed to represent the people of the USA, then who that is should be decided based on the total number of votes nationwide, without an electoral college.
Just saying…

At an interesting intersection

I haven’t published anything here on my personal blog for a while, a few years actually. Time flies.

The last time I wrote something here I was with a startup called BlueArc. BlueArc was acquired by Hitachi Data Systems pretty soon after that article was published (no connection to my article though 😉 ) and I got busy learning the new environment.


More recently, during the last year and change, I’ve been involved in some new business areas we’re starting up, and until recently I haven’t been able to talk publicly about this and, especially, the yet-to-be-announced solutions. We’ve been like a startup inside a bigger company – exciting and fast-paced, for a large company.

However, we’re now at a point where some of these solutions have been introduced to the market (and there’s more to come). These solutions are what we call “Social Innovation” solutions. They live at a very interesting intersection of leading-edge technology (Big Data analysis and the Internet of Things), deep industry expertise and a focus on meeting big societal challenges.

Sometimes you just happen to be in the right place at the right time 😉  Social Innovation at Hitachi is one of those places just now.


So what is this really?

There’s more on the company view of Social Innovation here, here and here, plus a blog I recently wrote in the HDS Community. That’ll hopefully give you some background to start with. Frost & Sullivan has also published white papers on Social Innovation if you want to read more.

Most people know Hitachi Data Systems (HDS) for its mission-critical infrastructures to manage data for banks, hospitals, manufacturing companies etc. – i.e. companies whose success depends on the availability of their data. What’s less known is that we’ve also been investing in analytics technologies and solutions for vertical applications. For example, last year we acquired a couple of companies in the Public Safety part of Smart Cities – Pantascene and Avrio. This is now the base for the Hitachi Visualization solution for Public Safety and Smart City.

More recently we announced the intent to acquire Pentaho, a leader in big data analytics and visualization. That deal is going through the regulatory checks and will close as soon as it has cleared that phase. It’s all part of completing the portfolio.


Another piece of the puzzle is that HDS is part of the larger Hitachi Group. Most of you have probably experienced other Hitachi products – like power tools, TVs, etc. Most of Hitachi’s non-IT products and solutions are, however, on the commercial and industrial side rather than the consumer side.

Hitachi has broad hands-on experience from many different industries. In several cases this is end-to-end, like in Healthcare where Hitachi makes sensors and instruments (e.g. MRI or Ultrasound scanners), treatment systems (e.g. for Proton Beam Therapy), diagnostic analysis solutions and clinical repositories of data – and Hitachi also owns and runs hospitals. So it’s truly end to end. Quite unique.

That’s just one industry as an example. Below is a small sampling of industries where Hitachi has a footprint.


We’re now applying that broad experience to what we call “Social Innovation” in a range of different market segments. The goal is to enable safer, smarter and healthier societies. These may sound like lofty goals, but they actually align with the long-standing (think a century) mission of Hitachi: to provide technological solutions that improve society.

Hitachi mission: “Contribute to society through the development of superior, original technology and products.”

How come I’m involved in this now?

Part of it is luck and good timing, but maybe also unconsciously moving to be in the right general area of the industry.

Being at the intersection of Big Data and the Internet of Things (IoT) is one thing, but I’ve also always had as a criterion for where I work that it needs to be a company that is “good”. When I joined Sun many years ago it was partly because they were doing really interesting things with technology and breaking the dominance of the incumbent players at the time, but it was also about aligning with the customers and being the “good guys” in the industry – driving to share technology through open source.

More and more people also realize that true and lasting success is about looking beyond the quarterly financial results; your real stakeholders are much more than just the shareholders. This was recently discussed in a Forbes article, and we’re actually conducting a poll on our community site about a similar question just now. It will be interesting to see the results. Please go ahead and cast your vote!

Now at Hitachi, my work, focus and values are along the same lines, as you can see above. However, now it’s also about non-IT technology – Operational Technology, or OT – and how we can apply technology to solve the bigger challenges. Take water for example, a very hot topic here in California today and also something I wrote about in a blog post here in 2009 that now seems almost like precognition. (Btw, I’ve acted on my own advice: last year I installed a water tank to collect rainwater, so I expect our vegetable garden this year to be totally irrigated by the rainwater I collected during the few rains we’ve had this season.)


However, the signs have been there a long time; it’s just about assembling the data and shining a light on some of the insights. This applies across many more areas – think about the impact on healthcare due to our aging population, or the impact on city infrastructures as more and more people move into urban areas. I’m enjoying this mix of IT and OT.

I’m still keeping my connection to technical computing alive as well; I was recently interviewed about Visualization for the American Oil & Gas Reporter.

Anyway, this was a quick run through to catch you up with what I’ve been doing lately. I’ll try to keep this blog more up to date going forward.

I recently wrote an article and was interviewed at the BioIT World Expo. This is now published in the May/June online edition of BioIT World Magazine (http://edition.pagesuite-professional.co.uk/launch.aspx?referral=mypagesuite&pnum=&refresh=fM1270nCZ30y&EID=e0620411-7193-4774-ae9b-a6b0781a1248&skip=).

In that article I make the points that in Next Generation Sequencing the rapid creation of new data continues, and also that the nature of the data workloads is changing at a much quicker pace than the IT infrastructure can keep up with. Bottom line: to not get stuck in a dead end, users need to invest in storage solutions that are architecturally designed to handle change.

Part of what is driving the rapid changes is the increasing amount of local and remote collaboration happening, also across disciplines. This is driving unpredictable combinations of data and a need for high-speed remote collaboration.

Now we’re also embarking on a series of seminars to have in-person discussions about this, together with our partners BioTeam and Aspera.

The first seminar will be in Boston on July 12, 2011. If you want to participate, you can sign up here: http://www.bluearc.com/lifesciences-ma

The second seminar will be in New York on July 14, 2011. If you want to participate, you can sign up here: http://www.bluearc.com/lifesciences-nyc

We also plan to do more of these across the US and you’ll find information about this in our Events listings at www.bluearc.com, where you also can sign up to receive our newsletter.

Well, I was at the long-awaited public Oracle/Sun strategy briefing yesterday. A rather long affair that certainly would have been enough time to cover all aspects of where the combined company is heading, and they did a pretty good job of it. There were a lot of statements that basically said “we are investing in Sun’s product X and Oracle’s product Y continues to be the strategic direction”, i.e. a lot of “and” and not a lot of “but” – especially during the software strategy talk. Despite this inclusive theme, there were however some glaring oversights and a missed opportunity to provide clarity by stating what they will NOT do.

As I told some people I met during lunch, it’ll be interesting to sit down later and ponder over what was NOT said or what was glossed over during the presentations and compare that with the statements that WERE made.

Being an HPC guy, my ears perked up when I heard the Lustre parallel file system mentioned as an example of an important open source project during Charles Phillips’ opening address. But as it turned out, that was the extent of telling us about the path forward with regards to HPC for Oracle/Sun. It was also the extent of the Lustre directions. With nothing explicitly said about HPC, I and others are left to speculate and read between the lines.

What WAS said was that Sun’s x64 systems would be focused on integrated clusters for the enterprise. They emphasized “integrated” and “enterprise”. I guess you can interpret that in several ways, but to me that sounds pretty much like the Exadata system that was launched in the fall and very different from selling general purpose servers (that btw also can be used for HPC). Was that a bone thrown towards Dell?

Oracle’s On Demand centers use Dell servers and NetApp storage as far as I know. I can imagine these will be switched to Sun servers and storage going forward. NetApp got sort of a black eye when Larry Ellison positioned Sun’s ZFS storage appliance as a next-generation NetApp, just better, faster and cheaper. There was no further reference to Dell however. The gloves never came off. A lot of Oracle software runs on Dell hardware…

HP wasn’t mentioned much either, IBM was used for almost all competitive comparisons. I guess I’d put what happens with the Dell and HP relationships in the “glossed over” category.

Good to hear that they are hiring though. That message wasn’t glossed over.

Update: HPCwire made a similar observation with regards to the future of HPC at Oracle/Sun.

Unless you’re in a newly constructed data center I would argue that compute density isn’t the problem you should focus on. You won’t even have the power and cooling density to fully utilize the most dense systems out there.

[There are definitely exceptions to this, such as when you’re dealing with the maximum distance for your networking essentially defining for you the radius for the area where you need to fit your equipment.]

But for most HPC users, this isn’t the case. You’re not pushing the physical limits for electrical signals, and your power and cooling are limited. If you’re in a data center built some years ago and you’re ready to upgrade to the next generation of hardware, then you already can get more performance out of every rack unit than was the case when the data center was built. In other words, you probably have floor space to spare when you move to newer hardware.

So why would you pay extra for higher density?

My take on it is that unless you’re in the very high end of HPC or have some other very special reason to do it, you shouldn’t. Density is not the problem to focus on. Results per watt is.
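To make “results per watt” concrete, here is a toy comparison. The numbers are made up purely for illustration, and “results” stands in for whatever throughput metric matters for your workload (jobs per hour, application GFLOPS, etc.).

```python
# Toy comparison of two hypothetical configurations (made-up numbers).
# The point: judge on results per watt first; rack-level density is a
# secondary concern if you have floor space to spare.
configs = {
    # name: (results per node, watts per node, nodes per rack)
    "dense blades":      (100.0, 450.0, 64),
    "standard 2U boxes": (95.0, 350.0, 16),
}

for name, (results, watts, nodes) in configs.items():
    print(f"{name:>18}: {results / watts:.3f} results/W, "
          f"{results * nodes:.0f} results per rack")
```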

If you follow that train of thought, and assume that you indeed have data center space to spare (or at least don’t need to reduce it), you first start to look at more generic servers that may or may not have more space in each box. You then distribute them more sparsely in the space you have. In one rack or multiple racks. Remember that you (usually) get more work done per box than you did with the last generation of hardware you installed.

Now, this may not meet your overall performance requirements. If so, it’s time to look at accelerators like GPUs or FPGAs and replace or complement your x86 servers with them. Depending on factors like your applications, whether you have access to source code, whether you have the skills to deal directly with FPGAs, etc., you’ll end up in your personal spot in this range of solutions. Nvidia, for example, has been working on this for a long time and has a nice set of applications ready to take advantage of its Tesla GPUs, as well as good development tools that make it easy to use them with an application or develop for them. Or, if you do have the skills to deal with FPGAs directly and have the volume and budget to support it, you could create a very specific accelerator for your needs.

The important thing is that by deploying accelerators like this you can address your overall performance requirements and still solve for “results per watt”.

At this point you have a so-called “nice” problem to contend with. This is where you need to decide if you want to get maximum performance out of the power/space/money budgets you have to work with, or if you’re OK meeting a certain performance level and instead minimizing the number of boxes you need to get there. I.e. do you exceed your performance target within your money/power/space budgets, or do you give something back from your budgets?

This morning I was reading John West’s article about Intel’s acquisition of RapidMind. It’s the latest example of the High Performance Computing (HPC) industry recognizing the need to make the use of accelerators and many/multi-core and cluster parallelism easier – or, to be specific with regards to the InsideHPC article, to make it easier to design software for this.

I have always viewed the need to customize your software for specific accelerators as, in most cases, a dead-end approach, be it GPU, FPGA or anything else. Granted, there’s a set of exception cases where developers and end-users are prepared to go down that route, fully aware of the costs. But to really reach the larger audience you need to make it much easier and essentially hide the complexity. History is littered with the remains of accelerator companies that never really solved that problem and could only take advantage of a limited window of opportunity.

I compare this with the times when I had to do assembly programming and count cycles to get that last ounce of performance that was needed in the embedded realtime systems I was working on while at Ericsson. In our case it made sense to do those time critical pieces at that low level, but for the most part we were using a high level language with built-in constructs for our most used and critical functions (realtime signaling and communicating over a high speed network designed for telecom and defense related applications). Only very few developers had to deal with the complexity of assembly level and really knowing what hardware was underneath. This approach greatly enhanced productivity when designing the actual applications and the performance was “good enough” so that we came out ahead every time.

I see many similarities between that and where the HPC industry has been with the use of accelerators and many/multi-core in parallel systems. It’s been a journey from having only those low-level, hardware-specific tools available for the really dedicated, to now having several approaches that uplevel it to a point where the application developer can keep essentially one source code and let the “system” take care of translating it in such a way that it takes maximum (or close enough) advantage of the hardware it runs on.

Steve Wallach of Convex and Data General fame, now at Convey Computer, has said it very well: “The architecture which is simpler to program will win.”

Apart from Intel/RapidMind, take a look at what Nvidia is doing with CUDA, OpenCL and integration with PGI compilers; what Convey Computer is doing with their HC-1 system; and for that matter what Apple and Microsoft are doing to promote common APIs (OpenCL and DirectX Compute, respectively).

We’re at an inflection point where the use of various types of accelerators is now easy enough for developers, and we’re getting to a point where it’s also easy to deploy – essentially providing “stealth acceleration” where it “just works” almost regardless of what hardware you have. This opens the door wide for heterogeneous clusters with Grid/Cloud-level software that takes the pain out of scheduling for optimum time to results.
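As a toy illustration of what “one source code” means in practice (and not any particular vendor’s toolchain): the same computation written as an explicit loop versus a single high-level array expression. The high-level form only states the intent, which is exactly what lets the library or runtime underneath target whatever hardware it finds without touching the application source.

```python
import numpy as np

# The same "a*x + y" computation written two ways. The loop nails down the
# implementation; the array expression states the intent, so the library
# underneath is free to use vectorized CPU code -- or, with an
# accelerator-backed array library exposing the same interface, a GPU --
# with no changes to the application code.

def saxpy_loop(a, x, y):
    out = np.empty_like(x)
    for i in range(len(x)):
        out[i] = a * x[i] + y[i]
    return out

def saxpy_highlevel(a, x, y):
    return a * x + y

x = np.random.rand(100_000).astype(np.float32)
y = np.random.rand(100_000).astype(np.float32)
assert np.allclose(saxpy_loop(2.0, x, y), saxpy_highlevel(2.0, x, y))
```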

If I compare with my previous example of what we were doing for realtime networked applications, the next step would be a high-level language that allows the developer to stay close to the application code and not worry about things like how to use MPI for best performance and scaling in a cluster. Sun’s Fortress language seemed to be addressing this in a way similar to what Java did for its space. However, with the Oracle acquisition of Sun you have to wonder whether Fortress will survive. I’m hoping it will, as an open source project.

(For an explanation of the “pig” reference in the title, please read to the end. It’s not meant to be negative.)

I’m attending CloudWorld in San Francisco this week. Actually, it’s a collection of three conferences running in parallel: OpenSource World (formerly LinuxWorld), Next Generation Data Center and CloudWorld. I’ve been focusing on the Cloud Computing side.

For someone like me with a background in HPC and Grid Computing (and distributed computing before the “Grid” term was invented), it’s a little like what Yogi Berra would have said: “This is like deja vu all over again.” Cloud Computing is about making a big computer out of a bunch of smaller ones and giving access to this “service” over a network, often using modern web-based portals and security mechanisms etc. Sound familiar?

When you poke under the hood, it looks eerily similar to Grid Computing. The “plumbing” is the same. It evolved out of trying to address the same problem: building that big computer out of distributed parts. So it’s no surprise that there are similarities. What’s different is the scale and the standardization at the different levels that is possible now with how the Web has evolved.

Some challenges still remain though, and one big one is around culture. How do you gain the trust of a user so that s/he will trust the service enough to place her/his precious data in the cloud? This was the same problem we had with Grids. Things are however slowly changing. To some extent, people are getting used to having some of their data on the web, through their personal interaction with web commerce etc. But most people are still wary about putting all their information out there. The same applies to business; people are looking at what can be risked to be out there (even through so-called secure mechanisms) and what they aren’t ready to put at that risk (yet). It’s about gaining trust, and I would claim we’re not 100% there yet.

Bill Nitzberg from Altair made the connection with history in his talk, and also with HPC as pioneering this technology. He made the observation that every decade since the ’60s has had its version of building a bigger computer out of smaller ones. Yesterday it was called Grid. Today it’s called Cloud. On the surface it looks like a “pig with a different snout” (a saying I picked up from listening to David Feherty and Nick Faldo during last weekend’s golf broadcast. I just love listening to their banter!). They are not quite the same thing though; the scale is different both in terms of the number of machines that work together and in terms of the scope of the problem being solved. “Tomorrow” we’ll talk about constellations of clouds and have a cute name for that.

Bottom line: if you have a High Performance Computing or Grid background, you’re well positioned to understand the Cloud Computing issues and able to leverage your experience. I’d say this is very different from having a general enterprise data center background. In HPC and Grid you’re used to thinking about thousands of compute resources behind a virtualization layer (Grid); enterprise data centers (in general) don’t deal with that scale. In HPC we’re used to pushing the boundaries just a little beyond what’s comfortable in order to get that last ounce of performance.

What are you doing about it?

California is just a broken levee or another dry winter away from a full-out water supply crisis. And that’s just the issue of having enough water for the bare necessities. It doesn’t count the growing dust bowl in the Central Valley that has emerged since water supplies to farming areas were cut. Nor does it include the environmental crisis in the delta that many say is already in full swing. If you take that into account, you could argue that we’re already in the middle of a crisis.

Yet, there doesn’t seem to be much awareness about this, and most people (and local governments) certainly don’t feel the urgency to do something. We can all pull our weight and make a difference with just a few simple changes. Actions like not letting the water run while brushing your teeth, limiting time in the shower, and installing water-efficient toilets and washing machines are some of the common recommendations. You should certainly do all of that, but there’s an easier target to go after first. The biggest consumer of water for the average Californian family is their landscaping. The lawn alone takes 20,000 gallons per year. That’s about a whole swimming pool of water.

Long term, you should phase out plants that require a lot of water to survive and move towards more drought-resistant landscaping. In general, this tends to mean moving towards native plants that are already adapted to the climate.

However, here’s a quick tip that most people can implement right away: most irrigation systems have a way to adjust for seasons. Our system allows us to set a percentage of the times we’ve programmed for normal irrigation. Many (including yours truly) have defined “normal” as the level you need to get through the hot parts of the summer. This means that you expend too much water most other days.

If you instead set the normal level to be perfect for the more average summer days you should be fine most of the summer and just need to keep an eye out for the need to water just a little bit extra on the hottest days.

Instead of reprogramming your whole system, you can just use the seasonal adjustment to check what level works for you. In our case we put it at 70% when we normally would’ve turned it to 100% for the summer (of course, we already have it completely turned off during the winter and only gradually increase the run times through the spring).

This approach works very well for us. There have been a couple of hot spells when I’ve had to manually water a little bit more to supplement the automatic system, but the lawn is definitely still green and doesn’t seem to be hurting at all.

If you figure that the manual add-on watering amounts to 10% (or less) – i.e. that in our case we in reality run the system at about 80% – then this strategy still translates into saving at least 4,000 gallons of water per year for the average family.
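The arithmetic is simple enough to sanity check; here is a minimal sketch using the rough figures from above.

```python
# Rough figures from above: an average lawn uses about 20,000 gallons/year
# at the "hottest day" setting. Controller at 70%, with roughly 10% added
# back by hand on the hottest days, gives ~80% effective usage.
LAWN_GALLONS_PER_YEAR = 20_000
effective_level = 0.70 + 0.10

saved = LAWN_GALLONS_PER_YEAR * (1.0 - effective_level)
print(f"Approximate savings: {saved:,.0f} gallons per year")  # ~4,000
```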

Now for your local government, how about foregoing those lawns on the medians of the roads? Plant native plants instead to provide some greenery if necessary. And how about making sure the irrigation systems work as expected, are tuned to the right levels and don’t leak? Just a few blocks from where I live, there’s a leak that’s been there for more than a couple of weeks now…

After some time planning I’ve taken the plunge and started remodeling the upstairs of our home. I’m fully aware that even when starting small, a remodeling project often expands to include much more than you initially had intended. Knowing that, our plan is hopefully more realistic than last time we did it and includes quite a lot from the start.

In broad strokes, this is what we’re planning to do:

  • Convert a master bedroom den to a home office
  • Build a walk-in closet
  • Remodel the master bathroom
  • Update the guest bathroom
  • Make necessary electrical updates due to the remodel
  • Install a whole house fan system
  • Rip out the carpets and install hardwood floors

Where are we now with this project? Half of the new walls are framed and the space for the walk-in closet is gutted. Over the last couple of days I also took advantage of the slightly cooler weather and installed the first of 3 fans that go into the whole house fan system. It can get pretty hot in the attic on warm days so it’s necessary to do that work on the cooler days.

We live in Pleasanton, California and it can get pretty hot in the summer with averages around 90F/32C and peaks well above 100F and 40C.

We’ve had more or less a manual whole house fan system for the last 6 years or so. That’s how long it’s been since we decided to not use the old AC system anymore and “go green”. We’ve been using multiple fans to move cooler, fresh air into the house in the mornings and after sundown. It’s been working surprisingly well, with the exception of when we get multiple really hot days in a row. But it has been a chore to do it and sometimes you miss your window of opportunity, i.e. you sleep in and the day is already warm when you get up. So in our case, we’ve already made the energy savings by foregoing the AC several years ago and it’s now more about making it more convenient.

The system I chose is a QuietCool QC-1500, which I selected because it promises to be very quiet and also came with a wireless remote control (using Zigbee for the networking, maybe I can do more interesting things with that later?). The recommended size of system for our house had 3 fans to be installed in the attic. Actually, they seem to have changed the sizing chart now and maybe 4 fans would be more optimal according to their chart. But I’m betting they have changed it just to sell more fans. I can always install one more fan later if I’m wrong.

One QC-1500 whole house fan

I ordered the system online from A Trendy Home (they had a Father’s Day sale) and it showed up after just a few days. Yesterday I installed the first of the fans and have temporarily connected it with an extension cord to test it. Later I need to get an electrician to install a dedicated outlet for it.

I must say that so far it’s delivering as promised on most things. Even with just one fan operating for now, there’s a definite breeze through the house and the remote control works in any part of the house. It’s not quite as quiet as I had expected though. There’s a faint, low-frequency rumbling noise. It may just be that it’s a new sound, because it’s definitely quieter and less intrusive than the fans we were using before. I guess I may have had somewhat unrealistic expectations of how quiet it would be. Bottom line, and in anticipation of what the next 2 fans will do, I would still highly recommend it.

Going back to the HPC education issue I mentioned in my previous post, that actually touches on the other theme that was clear today at the sessions at the 23rd HPCC conference.

It seems like almost no one argues against the idea that continued performance increases in HPC will come from more multithreading, multicore, many-core, GPUs and other accelerators, often in a heterogeneous mix of thousands (or eventually millions) of each.

This is not a panacea however; there certainly are problems to be solved in those areas as well – the infamous memory wall and energy consumption would be two of them.

The biggest challenge in my mind though is on the software side. Our middleware, tools and applications are just not keeping up. We don’t have the software technology today that makes it easier to automatically take advantage of the inherent parallelism in the hardware infrastructure. We’re edging into the Petascale era today while providing essentially assembly-level programming tools. That won’t work for the next level, closer to Exascale.

We need to invest in software that on one hand hides the underlying complexity and makes it easy to scale, and on the other hand makes it possible to state the problem to be solved in a form close to its natural representation – much like Fortress allows mathematical notation to be used to more easily represent equations. We need to bridge the gap between the domain knowledge that can describe the problem and the low-level “magic touch” that is needed to get code to scale.

It’s not that this is news. Many people have pointed this out, but we don’t seem to make progress towards a solution. It’s not that we as a “collective ostrich” are hiding our heads in the sand and hoping it will go away. It won’t.

The problem is that there’s no business case for a single software vendor to take on this huge challenge. This is an area that definitely requires government funding and industry wide attention.

One speaker suggested that HPC needs to be elevated to the same importance as a nationwide energy strategy. It’s that important. I tend to agree with him. We need to do whatever it takes to start to make progress in this area.

I also intend to continue twittering tomorrow under my Bearcrossings twitter id.