Bookmarks for June 29th through July 26th

Interesting links for June 29th through July 26th:

Impact of Infrastructure Automation Tools on Monitoring

IT management teams in charge of monitoring the IT infrastructure (network, servers, applications, etc.) mostly have little insight into what it is that they are monitoring. (Obvious or shocking?)

The tools we use make this clear. Monitoring tools all have some sort of “discovery” functionality to figure out what is out there to monitor. More often than not, when we set discovery loose on the network (using SNMP, ICMP, etc.), it finds devices and network connections that customers did not know existed.

Server/application monitoring tools start their cycle by scanning ports to see which ones would respond, or by sniffing the traffic to figure out which servers are out there, which applications may be running on which servers, etc.
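As a rough illustration of the port-scanning style of discovery described above, here is a minimal sketch using only Python's standard library. The function name `probe_ports` and the example host and port list are assumptions made for illustration; real monitoring products use far more elaborate (and proprietary) discovery logic.

```python
import socket

def probe_ports(host, ports, timeout=0.5):
    """Return the ports on `host` that accept a TCP connection."""
    open_ports = []
    for port in ports:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=timeout):
                open_ports.append(port)
        except OSError:
            # Refused, timed out or unreachable: treat as "nothing there".
            pass
    return open_ports

if __name__ == "__main__":
    # Hypothetical target and port list; adjust for your own environment.
    print(probe_ports("127.0.0.1", [22, 80, 443, 5432]))
```

Note that the result only says that something answered on a port. It says nothing about which application it is or who owns it, which is exactly the "outsider" position the post describes.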

The process would not be much different if you were attacking the infrastructure to find a way in. (How many of you have triggered security alerts when performing discovery?) We’re outsiders. In an enterprise environment, we often don’t even know the owners/developers of the applications.

In the monitoring field, this has been the norm for so long that it no longer bothers us. It should. Of course, monitoring teams and tools don’t do this for fun. It has to be done because in most cases there is no truth teller, no place to get this kind of information. It is not uncommon for monitoring tools to feed the data they discover to inventory tools and the like.

There are efforts like CMDB projects that attempt to create a repository that provides this information to all management tools, but these projects often run into organizational as well as technical obstacles, and things are getting harder by the day with the dynamism introduced by virtualization and cloud technologies.

What if we didn’t have to do all this crap to know what’s what? What if monitoring tools could be told which applications run on which server, where that server is in the network, etc.?

There is indeed a better way, at least for some use cases. The proliferation of infrastructure automation tools (aka configuration management tools) such as Chef, and of the management APIs exposed by VMware and others, has the potential to change not only how we deploy and maintain servers and applications but also how we monitor them.

The most obvious impact is that these tools give monitoring tools a reliable source to learn about the infrastructure that should be monitored: what the roles of the servers are, how they are configured, which application components run on which server, what the change history is, etc. This is a huge step forward.
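To make the idea concrete, here is a small sketch of a monitoring tool being told what to monitor instead of discovering it. Everything in it is hypothetical: the inventory dictionary only stands in for data a configuration management tool could export, and the node names, roles and check names are made up. It does not use Chef's actual API.

```python
# Hypothetical inventory, shaped like data a configuration management
# tool might export. Node names, roles and addresses are invented.
INVENTORY = {
    "web01": {"roles": ["webserver"], "ip": "10.0.0.11"},
    "db01": {"roles": ["database", "backup"], "ip": "10.0.0.21"},
}

# Hypothetical mapping from a server role to the checks that role implies.
CHECKS_BY_ROLE = {
    "webserver": ["http_200", "tls_cert_expiry"],
    "database": ["tcp_5432", "replication_lag"],
}

def build_check_plan(inventory, checks_by_role):
    """Derive the list of checks for every node from its declared roles."""
    plan = {}
    for node, attrs in sorted(inventory.items()):
        checks = ["ping"]  # baseline check applied to every node
        for role in attrs["roles"]:
            # Roles without a known mapping simply add no checks.
            checks.extend(checks_by_role.get(role, []))
        plan[node] = checks
    return plan

if __name__ == "__main__":
    for node, checks in build_check_plan(INVENTORY, CHECKS_BY_ROLE).items():
        print(node, checks)
```

The point of the sketch is the direction of the data flow: the check plan is computed from declared roles, so a new server gets the right checks the moment it appears in the inventory, with no scanning involved.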

When you know how things should be, it’s much easier to detect the exceptions. A significant portion of problems are caused by changes somewhere in the infrastructure. The ability to automate changes, see the change history and roll back when needed is invaluable. And being able to correlate configuration changes with monitoring data can significantly reduce troubleshooting time and hence improve availability.
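Correlating changes with monitoring data can be as simple as asking "what changed on this server shortly before the alert?". The sketch below assumes that both alerts and change records are available as plain dictionaries with a node name and a timestamp; the field names, the sample data and the 30-minute window are all illustrative assumptions, not any particular tool's format.

```python
from datetime import datetime, timedelta

def changes_before_alert(alert, changes, window=timedelta(minutes=30)):
    """Return changes on the alert's node made within `window` before the
    alert fired, most recent first."""
    start = alert["time"] - window
    hits = [
        c for c in changes
        if c["node"] == alert["node"] and start <= c["time"] <= alert["time"]
    ]
    return sorted(hits, key=lambda c: c["time"], reverse=True)

# Hypothetical sample data.
ALERT = {"node": "web01", "time": datetime(2010, 7, 1, 12, 0), "check": "http_200"}
CHANGES = [
    {"node": "web01", "time": datetime(2010, 7, 1, 11, 45), "what": "web server config updated"},
    {"node": "web01", "time": datetime(2010, 7, 1, 9, 0), "what": "package upgrade"},
    {"node": "db01", "time": datetime(2010, 7, 1, 11, 50), "what": "schema migration"},
]

if __name__ == "__main__":
    for change in changes_before_alert(ALERT, CHANGES):
        print(change["time"], change["what"])
```

A time window per node is a deliberately crude heuristic. It will miss changes on dependencies (the database behind the web server, for instance), which is where knowing the relationships between components starts to matter.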

Another impact is on support. A safe framework that lets operations folks take actions to troubleshoot and resolve problems (combined with run book automation, workflow, wiki, etc.) may finally mean that level 1/2 support folks can do more than record and route, without being given full access to the systems (which is not feasible). That would reduce the number of problems escalated to higher levels and increase overall productivity.

I should state just for the record that I don’t mean that infrastructure automation tools like Chef introduce brand new technology. Opsware (now HP), BladeLogic (now BMC), ConfigureSoft (now VMware) for server configuration management, and TrueControl (now HP via Opsware), Voyence (now EMC), AlterPoint for network configuration management have been around for some time. But a confluence of factors, such as the success of Chef’s (Apache licensed) open source model, increasing acceptance of cloud economics, and patterns such as the availability of open APIs, moves Chef to center stage. It does not take great wisdom to infer that a price point such as $50/month for 20 devices will make Chef very hard to ignore. Price is indeed a feature.

Looking forward to seeing how infrastructure tools like Chef will evolve as they move further into the enterprise world. Monitoring folks need to pay attention.



Bookmarks for June 15th through June 28th

Interesting links for June 15th through June 28th:

Bookmarks for May 28th through June 2nd

Interesting links for May 28th through June 2nd:

Bookmarks for April 24th through May 25th

Interesting links for April 24th through May 25th:

SaaS model in IT Operations Management – is it in our future?

Is delivering IT operations management as “software as a service (SaaS)” a viable option?

I think it’s a question worth contemplating for anyone involved in IT Ops. Yes, cloud hysteria is everywhere, and yes, a lot of what’s going on is vendors rephrasing the same products with the latest buzzwords. Nonetheless, there are also signs of a major shift that can potentially have a major impact on IT Operations Management.

There is no doubt that IT Ops will be drastically different going forward as organizations start using more and more “cloud” services, but that is not the focus of this post. What I want to hash out is whether SaaS is a viable model for delivering IT management itself and, going even further, whether it will become the dominant model in the not so distant future.

Let’s start with a look at the current state of IT Operations Management. Is there actually a problem that needs a solution? I’m pretty sure we all agree that there is indeed a problem.

Most organizations are stuck in the muck

Currently, implementing just the base solutions takes so much time and effort that only a few organizations have the means and the will to proceed any further. There has been little innovation in the field, and even the ideas and technologies that have been around for many years don’t get applied.

For example, let’s think of what it takes to implement and maintain an event management solution in a large network. IBM’s Netcool suite is widely accepted as the de facto standard product for event management and has a very large user base. Yet the solution has many moving parts:

  • Probes
  • Tiered Object Servers for aggregation, presentation, etc.
  • Bi-directional gateways for replication
  • Webtop, TIP, etc. to provide web based UI
  • Reporter and Oracle for reporting

That is at least a dozen application processes. Just installing the right versions of the included software, avoiding compatibility issues and integrating the components is a major undertaking, let alone mastering how to develop solutions using them. And this is just to consolidate events in a single repository, nothing advanced at all.

As a result of this complexity, highly skilled resources get bogged down implementing & maintaining the base solution, struggling to find the bandwidth to implement features/techniques that would truly add value: enrichment, automation, correlation, visualization, service management etc.
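For a sense of what two of these value-add techniques mean in practice, here is a toy sketch of enrichment and a very basic form of correlation (collapsing repeated events). It is a generic illustration with made-up field names and data; it is not how Netcool or any other product implements these features.

```python
def consolidate(events, inventory):
    """Collapse repeated events into one record per (node, summary) and
    enrich each record with the owning team from the inventory."""
    merged = {}
    for ev in events:
        key = (ev["node"], ev["summary"])
        if key in merged:
            # Repeat of a known event: bump the counter, keep the first record.
            merged[key]["count"] += 1
            merged[key]["last_seen"] = ev["time"]
        else:
            # Enrichment: look up who owns the node, if the inventory knows.
            owner = inventory.get(ev["node"], {}).get("owner", "unknown")
            merged[key] = {**ev, "count": 1, "last_seen": ev["time"], "owner": owner}
    return list(merged.values())

# Hypothetical sample data; `time` is just a sequence number here.
OWNERS = {"web01": {"owner": "web-team"}}
EVENTS = [
    {"node": "web01", "summary": "link down", "time": 1},
    {"node": "web01", "summary": "link down", "time": 2},
    {"node": "sw07", "summary": "fan failure", "time": 3},
]

if __name__ == "__main__":
    for record in consolidate(EVENTS, OWNERS):
        print(record)
```

The logic itself is a few lines. The expensive part in real deployments is everything around it: getting events in reliably, keeping the inventory trustworthy and running the platform.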

It is also very costly and difficult to build sophisticated solutions on top of such a complicated and hard to maintain foundation. Hence organizations find it hard to show ROI and justify any further investment. Solutions at best stagnate where they are, performing the bare minimum, and at worst degrade over time until they become unusable.

So how may SaaS help?

SaaS is not a magic bullet. If someone took the same product suites and attempted to provide them as a service, they would have little to no chance of succeeding. The dominant products from the Big 4+ vendors are quite old and not designed for the “cloud”. But a solution that is designed from the ground up with the new constraints and opportunities introduced by cloud and other modern technologies may have a significant impact.

What if event management was available as a SaaS offering? An event management solution that has high availability, no scalability limitations, a modern web based UI with impressive visualization capabilities, correlation using complex event processing techniques, workflow, integrated reporting, etc.? Even more, what if it also offered a development platform for others to build solutions on, similar to Salesforce’s Force.com? Would it not change the entire landscape?

Such an offering can potentially solve the majority of the problems Ops organizations currently struggle with (the muck), freeing up their resources to move up the chain and tackle more value add projects. It would also potentially provide substantial savings, making it quite attractive to the business.

SaaS as a model has moved into the mainstream. It is no longer necessary to explain to people what it is, or why and how it provides value. Although it may not have been embraced by everyone in the enterprise world, there are signs that it may even be becoming the preferred approach for many organizations.

And SaaS offerings have come to IT management as well. The Service-now.com ITSM service is a stellar example of the power and potential of SaaS in IT management. It has already changed the ITSM landscape, forcing established players to scramble to offer their own solutions as SaaS. There are also already a number of SaaS offerings in the market, typically targeting SMBs. Can an event management solution for large enterprises be far behind?

No doubt there are obstacles, both technical and organizational, that may hinder adoption of SaaS in IT Operations. The most obvious ones seem to be security concerns and integration, but are these show stoppers or just issues that need to be worked out? These concerns apply to any application, and although they are real, they do not seem to hinder adoption of SaaS in other areas. Is there something that makes IT Operations Management so unique that it can be immune to the SaaS tidal wave?

One thing is certain: if SaaS gains traction in IT Ops, our lives will never be the same! I think it is time to assess what the implications of SaaS may be and figure out what we need to do to surf the wave rather than get swept away by it.

What do you think? I would love to hear your thoughts and compare notes…



Bookmarks for April 20th from 08:47 to 14:43

Interesting links for April 20th from 08:47 to 14:43:

  • Easy VMware Development with VI Java API and Groovy | virtual insanity – RT @aalmiray Easy VMware Development with VI Java API and Groovy http://bit.ly/8YXPQF
  • dy/dan » Blog Archive » My TEDxNYED Session — Math Curriculum Makeover – great short presentation from Dan Meyer @ddmeyer about how to change the math curriculum http://blog.mrmeyer.com/?p=6548
  • Steve Blank On How Startups Evolve Into Large Companies – According to Blank, the line he draws between smaller startups and larger companies is based around the business model. Startups, he says, exist in the state where they are searching for a business model, and large companies are the result of finding and executing that business model. The reason he calls out accountants in the title of his talk is that as startups transition into larger companies, their less conventional methodologies become more traditional, and that's when accountants are needed

Bookmarks for April 9th through April 19th

Interesting links for April 9th through April 19th:

Bookmarks for April 6th through April 8th

Interesting links for April 6th through April 8th:

Bookmarks for March 20th through April 1st

Interesting links for March 20th through April 1st:
