Pavleck.Net

Monitoring, Scripting, and other Technologies

Archive for the 'Powershell' Category


Not sitting on my laurels.

Posted by Jeremy D. Pavleck on 5th August 2008

It’s been a while. I’ve actually been terribly busy at my current client, implementing and fine-tuning my Alert Resolution State notification workflow. I’m currently expanding it to handle a few dozen different teams, creating PropertyBags to send performance data (number of alerts changed per category, total time the script ran, total alerts, etc.), and adding more robust failure checks so that it can also alert if it fails for any reason.

I’ve also been playing around with PowerShell and 37-year-old code.

Super Star Trek - in PowerShell!


Yes, that’s Super Star Trek. Just a little time-waster I work on while I’m mulling over a problem or two. And huge thanks to Jaykul of course, for all of his Powershell knowledge. I can’t do it without him and the crew in #powershell!

Are you on Twitter? If so, be sure to follow OpsMgr to stay on top of the most recent SCOM posts out there! And while you’re at it, feel free to follow me as well - I can always use more friends.

Until next time.

Posted in Community, Powershell, SCOM | 2 Comments »

Setting agent failover servers & Switching SNMP device proxies

Posted by Jeremy D. Pavleck on 7th July 2008

By default, you can’t really specify a failover management server in OpsMgr. Why? I’m not really sure, though I think it’s a ploy to ensure you set up the OpsMgr Active Directory Integration, which will handle this for you.

No fret though, we can still do it - it’ll just take a little bit of actual effort.

First, we need to define our Primary and Failover management servers. This isn’t something you can just programmatically grab, so you’ll need to know the names yourself.

In my $PROFILE, I’ve assigned them to two variables with the following:

# Set Primary Management Server
$Primary_MS = Get-ManagementServer | ? {$_.Name -like "SERVERNAME*"}
# Set Failover Management Server
$Failover_MS = Get-ManagementServer | ? {$_.Name -like "BACKUPSERVER*"}

Now that we have that set, it’s simple to do the rest. First, let’s grab all of the servers that don’t have a failover management server set.

# @() keeps the result an array, so .Count works even if only one agent matches
$noFailoverSpecified = @(Get-Agent | ? {(!$_.GetFailoverManagementServers())})

What the above does is call the GetFailoverManagementServers() method on each agent. If an agent has failovers, the method returns a non-empty collection, which PowerShell treats as $True. If there aren’t any failovers, it returns an empty collection, which is treated the same as $False. So we keep all the agents where nothing comes back.
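If the truthiness rule feels like magic, here is a small stand-alone sketch that needs no SCOM connection. The two variables are made-up stand-ins for what GetFailoverManagementServers() might hand back; the server names are invented for the example.

```powershell
# Stand-in values; in the real filter these come from GetFailoverManagementServers().
$noFailovers  = @()                                      # agent with no failover set
$hasFailovers = @('BACKUPSERVER01', 'BACKUPSERVER02')    # agent with failovers set

# An empty collection converts to $false, so negating it gives $true.
$selectedA = !$noFailovers
$selectedB = !$hasFailovers

"No failovers  -> picked up by the filter: $selectedA"
"Has failovers -> picked up by the filter: $selectedB"
```

Only the first case passes the `!` test, which is exactly the set of agents we want to fix.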

If you’re curious, you can see just how many servers are missing failovers with

$noFailoverSpecified.Count

- in my case it was 63.

Now, we just run a quick snippet that adds the failover server to the agent:

ForEach ($agent in $noFailoverSpecified) {
    Set-ManagementServer -PrimaryManagementServer $Primary_MS -AgentManagedComputer $agent -FailoverServer $Failover_MS | Out-Null
}

That will crunch away as it does its thing. We’re redirecting output to $null so we don’t have to see agents scrolling over and over. When it returns you to a prompt, you’re done. If you’d like to verify that you did indeed set all of the agents to have a failover, we can check real quick:

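# @() forces an array, so .Count is 0 when nothing is returned
# (comparing .Count to $null misfires when exactly one agent is left, and on newer PowerShell versions)
If (@(Get-Agent | ? {(!$_.GetFailoverManagementServers())}).Count -eq 0)
{
    Write-Host "Every agent has a failover server, great job!" -ForeGroundColor Green
} else {
    Write-Host "Looks like we missed some, try again!" -ForeGroundColor Magenta
}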
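If you run this check often, it can be wrapped in a couple of small functions. This is only a sketch: `Get-AgentWithoutFailover` and `Test-AllAgentsHaveFailover` are names I made up for illustration, and the only OpsMgr pieces used are the Get-Agent cmdlet and GetFailoverManagementServers() method from above.

```powershell
# Hypothetical helper: returns the agents that have no failover management server.
function Get-AgentWithoutFailover {
    Get-Agent | Where-Object { !$_.GetFailoverManagementServers() }
}

# Hypothetical helper: reports the result and returns $true or $false.
function Test-AllAgentsHaveFailover {
    # @() forces an array so .Count is reliable for zero, one, or many results.
    $missing = @(Get-AgentWithoutFailover)
    if ($missing.Count -eq 0) {
        Write-Host "Every agent has a failover server." -ForegroundColor Green
        return $true
    }
    Write-Host "$($missing.Count) agent(s) still have no failover:" -ForegroundColor Magenta
    $missing | ForEach-Object { Write-Host "  $($_.Name)" }
    return $false
}
```

Listing the stragglers by name saves a second trip through Get-Agent when something was missed.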

And that’s that. All of your agents have a primary and failover server.

Screen shot of SCOM Command Shell showing steps to setup failover agents

But wait, you have a lot of remotely managed devices too? Monitoring SNMP on a bunch of different servers - what happens for that?

Well, as far as I can tell we can’t set up a failover proxy agent (if I’m wrong, please let me know). But we can proactively write a script that will change the proxy agent on the devices, and run it as needed.

This was written in response to this query on the newsgroups, and is only a cursory look into it. There may be other ways of doing this - and I’d love to hear them. As it stands, I’m not sure how to set them back to a management server as the monitor.

Firstly, we’ll have to pick an agent-managed computer to use as the new proxy agent. You can’t use a management server for this, because they aren’t “Agent Managed”, and you can’t use Set-ManagementServer because the devices aren’t “Remote Managed Computers”. I have a separate agent-managed server on my network I call “Timex” because it acts like a watcher node. So I’ll go ahead and use him.

$proxyAgent = Get-Agent |? {$_.Name -eq "timex.pavleck.net"}

Then gather a list of our current remotely managed devices

$remDevices = Get-RemotelyManagedDevice

Now just loop through it, setting the device to use the proxy agent we just instantiated:

ForEach($device in $remDevices) {
    Set-ProxyAgent -ProxyAgent $proxyAgent -Device $device | Out-Null
}

That will loop through things changing the proxy server that it uses. When it’s done, we can verify it by running:

Get-RemotelyManagedDevice |? {$_.ProxyAgentPrincipalName -ne $proxyAgent.Name}

If it outputs nothing, then they’ve all been changed. Simple as that!
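Since the whole point is to run this when a proxy goes away, the steps can be rolled into one function. Treat this as a sketch: `Switch-DeviceProxy` is a name I invented, and the cmdlets and parameters are used exactly as in the steps above.

```powershell
# Hypothetical wrapper around the steps above.
function Switch-DeviceProxy([string]$proxyName) {
    $proxyAgent = Get-Agent | Where-Object { $_.Name -eq $proxyName }
    if (!$proxyAgent) {
        Write-Host "No agent named $proxyName found." -ForegroundColor Magenta
        return
    }
    # @() avoids looping over $null on older PowerShell when there are no devices.
    foreach ($device in @(Get-RemotelyManagedDevice)) {
        Set-ProxyAgent -ProxyAgent $proxyAgent -Device $device | Out-Null
    }
    # Anything returned here is still pointing at a different proxy.
    Get-RemotelyManagedDevice | Where-Object { $_.ProxyAgentPrincipalName -ne $proxyAgent.Name }
}
```

Called as `Switch-DeviceProxy "timex.pavleck.net"`, it outputs nothing when every device was switched.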

SCOM: Setting the proxy agent for a device via command shell

Posted in Command Shell, Powershell, SCOM, SCOM Snippets, SNMP | 2 Comments »

A ‘consoleless’ OpsMgr

Posted by Jeremy D. Pavleck on 23rd June 2008

In MOM 2005, virtually everything was a rule. A rule looked for an event in the event viewer, a line in a log file, a return code from a script, etc. and fired off an alert (or did another action). It was essentially ‘dumb’, because it had no idea whether an event it raised was ever fixed. It just fired them off every time it saw it.

Enter OpsMgr 2007. It introduced us to an old concept: the ‘monitor’. The monitor is a multi-state event. It watches for multiple items; something will set a particular item into a failed or degraded state, and there is a corresponding event that marks it as being healthy again. This is wonderful, as it helps minimize the number of open alerts sitting in your system at any given time. Fewer open alerts means we have more relevant information to look at.

When it comes to core Windows monitors, it works beautifully and 100% of the time. If you cross a memory threshold, an event is created and an alert goes out (If you’ve set it up to alert). When the memory drops below this threshold, then the monitor marks that particular object as being in a Healthy state again and, if you’ve allowed it to, it auto-closes the alert.

Where it doesn’t work beautifully and 100% of the time is when you need to rely on 3rd party agents and management packs. I’ll use the HP Management Packs as an example, because that’s what I’ve been facing recently.

The way OpsMgr knows about hardware events that happen on an HP machine is because the HP agents themselves will place an event in the Event Viewer and/or send an SNMP trap about it. Works flawlessly to create an event in SCOM about an unhealthy object. What doesn’t work perfectly is the corresponding event that marks that system as being healthy again.

The reason for this seems to depend on the exact configuration of a server, the version of the HP agents, and the actual event itself. If there is an event, such as a power supply failing, the log is populated and SCOM creates an event saying “Power Supply #1 degraded.” When that power supply is replaced, it won’t necessarily auto-resolve the event, because instead of seeing “Power Supply #1 Healthy”, the HP agents might instead log “Power Supply (Serial number: FD30401104-P) Inserted into Bay #0”. The monitor isn’t looking for that, and so it isn’t aware that that is the corresponding ‘good’ event, and the event stays open.

So theoretically you could replace a failing piece of hardware, such as a Power Supply, which doesn’t auto-resolve and then in the future have that same PSU die, which won’t cause a new alert and literally leave you ‘powerless’ to know what is going on.
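To make that failure mode concrete, here is a toy two-state monitor in plain PowerShell. It is purely illustrative and not how OpsMgr is implemented internally; the function name `Get-MonitorState` is made up, and the event strings are the examples from above.

```powershell
# Toy model only: one 'bad' pattern, one 'good' pattern, everything else ignored.
$unhealthyPattern = 'Power Supply #1 degraded'
$healthyPattern   = 'Power Supply #1 Healthy'

function Get-MonitorState([string[]]$events) {
    $state = 'Healthy'
    $alertsRaised = 0
    foreach ($e in $events) {
        if ($e -match [regex]::Escape($unhealthyPattern)) {
            # An alert is only raised on the Healthy -> Degraded transition.
            if ($state -eq 'Healthy') { $alertsRaised++ }
            $state = 'Degraded'
        } elseif ($e -match [regex]::Escape($healthyPattern)) {
            $state = 'Healthy'
        }
        # Any other wording (such as "Inserted into Bay #0") changes nothing.
    }
    "State: $state, alerts raised: $alertsRaised"
}

# PSU fails, is replaced (logged with unexpected wording), then fails again.
Get-MonitorState @(
    'Power Supply #1 degraded.',
    'Power Supply (Serial number: FD30401104-P) Inserted into Bay #0',
    'Power Supply #1 degraded.'
)
# → State: Degraded, alerts raised: 1
```

Two failures, one alert: the second failure is swallowed because the monitor never saw a ‘good’ event in between.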

Now, in a normal deployment of OpsMgr this isn’t too large of a concern. There are always eyes on the console or emails being sent. Someone will see it, fix it, then ensure the event is closed.

The current situation I’m in, however, doesn’t work this way. SCOM is being used consoleless to monitor a group of monitoring tools. Essentially it’s here to keep ‘them’ honest, and to ensure there’s another level of defense to protect us and let us know when a failure has occurred.

Because of this, those slight discrepancies in the HP agents and the HP management pack aren’t acceptable. But OpsMgr really doesn’t have a way of being run without anyone paying attention to it - or does it?

It actually does. What I’ve set up at this site is a PowerShell script which runs every 4 hours and resolves all the open HP alerts. The HP Agents themselves will run a self-check every hour or so, and log that “Power Supply #1” is still failed. Because we’ve already cleared that alert, SCOM will pick it up again and re-fire the event, the alert, and all that jazz. In essence, we’ve created a ‘nag’ feature in SCOM.

This is beneficial in our case, because the current setup of OpsMgr where I’m at is mainly there to watch the other monitoring tools. This ‘nag’ lets us know that the problem was either not taken care of, or was not alerted on - thus ‘keeping them honest’.

How we do all this is very simple - the OpsMgr Command Shell has almost everything we need.

We’ll use Get-Alert to bring back a list of all open HP events, and Resolve-Alert to close them, adding a comment that we automated this.

To find the HP alerts, we need to match against the MonitoringObjectFullName property inside the alert. Through trial and error, I noticed that every single HP object began with “HewlettPackard”. So we’ll match against that, picking all alerts that don’t have a resolution state of 255 (Closed).

From there, we cycle through the alert array, passing each one to Resolve-Alert, along with a -comment - in my case I used “Closed by Powershell - see (link) for more details” with a link to the internal Wiki.
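Stripped of all the reporting, the core of it looks something like the sketch below. The function name `Resolve-HPAlerts` and the default comment text are placeholders of mine, and the `^` anchor is my own addition to match “began with”; Get-Alert and Resolve-Alert are used the same way as in the full script further down.

```powershell
# Hypothetical minimal version: find open HP alerts and resolve them with a comment.
function Resolve-HPAlerts([string]$objectName = "HewlettPackard",
                          [string]$comment = "Closed by Powershell") {
    # @() keeps the result an array even when zero or one alert matches.
    $open = @(Get-Alert | Where-Object {
        ($_.MonitoringObjectFullName -match "^$objectName") -and ($_.ResolutionState -ne 255)
    })
    foreach ($alert in $open) {
        Resolve-Alert -Alert $alert -Comment $comment | Out-Null
    }
    "Resolved $($open.Count) alert(s)."
}
```

Filtering on ResolutionState before resolving keeps already-closed alerts from being touched again.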

And that’s really all that there is to it. Mind you, I’ve done a lot more in the script, as you’ll see below. It measures how long it took to bring up the alerts, counts how many there were per severity, the repeat count, etc., then creates a PropertyBag and submits all the information to OpsMgr for reporting. It then also logs it to the Event Viewer.

Download SCOM-Resolve-HardwareAlerts.ps1

This script is best set up to run every 4 hours or so. It’s set up as a generic ‘timed script’ inside of SCOM. If you’d like more info on setting up SCOM to work with Powershell more properly, see Brian Wren’s post here.

Here’s the script:

# ==============================================================================================
#
# Microsoft PowerShell Source File — Created with SAPIEN Technologies PrimalScript 4.1
#
# NAME: SCOM-Resolve-HardwareAlerts.ps1
#
# AUTHOR: Jeremy D. Pavleck , JPavleck@GMail.com
# DATE  : 6/11/2008
#
# COMMENT: When run, will gather all open HP alerts and mark them as resolved, setting a user
#    defined comment as well. It will then log to the event viewer it has done so.
#
# NOTES: The "Object Name" we use to determine what rules we want to resolve comes from the
#    MonitoringObjectFullName field of Get-Alert.
#    Also, you’ll need to either set this command to start in your SCOM2007 dir (By default
#    C:\Program Files\System Center Operations Manager 2007) or edit Microsoft.EnterpriseManagement.OperationsManager.ClientShell.Startup.ps1
#    in said directory and change the dot source reference from current directory to the complete path.
#
# When calling this from an OpsMgr scheduled command, use
# powershell  -PSConsoleFile "C:\Program Files\System Center Operations Manager 2007\Microsoft.EnterpriseManagement.OperationsManager.ClientShell.Console.psc1" -command "& {C:\Script\Path.ps1}"
#
# ==============================================================================================
# Ensure that the OpsMgr snap-in is there (testing the result directly also keeps it out of the output)
If (!(Get-PSSnapin -Name Microsoft.EnterpriseManagement.OperationsManager.Client -ErrorAction SilentlyContinue)) {
    throw "OpsMgr Console not loaded - please run with -PSConsoleFile 'X:\Path\To\Microsoft.EnterpriseManagement.OperationsManager.ClientShell.Console.psc1'"
} else {
    # CHANGE THIS to match the path in your system. Or don’t.
    . "C:\Program Files\System Center Operations Manager 2007\Microsoft.EnterpriseManagement.OperationsManager.ClientShell.Startup.ps1" # Load OpsMgr stuff.
}

# Create some counters
$iinfo = 0
$iwarn = 0
$ierr = 0
$icrit = 0
$iunk = 0

### Configuration Section ###
$objectName = "HewlettPackard" # All of the HP objects start with this
$comment = "Automatically Resolved via PowerShell" # Added to alert

# Create the SCOM Script API object, so we can shove this info into the database
$momapi = New-Object -comObject "MOM.ScriptAPI"

# Grab all alerts that match MonitoringObjectFullName and are not Closed
# Time the whole thing for no reason
$findAlertsTime = Measure-Command {
    # @() keeps the result an array even when zero or one alert matches
    $openHPAlerts = @(get-alert | Where-Object {
        ($_.MonitoringObjectFullName -match $objectName) -and ($_.ResolutionState -ne 255)
    })
}

# Let’s grab some stats about what we grabbed first, before we resolve them.
$openCount = $openHPAlerts.Count
$totalFindTime = ([datetime]($findAlertsTime.ticks)).ToString("HH:mm:ss")
# Create a property bag to hold values to send to SCOM
$pbag = $momapi.CreatePropertyBag()
$pbag.AddValue("Total_Open", $openCount)
$pbag.AddValue("Total_FindTime", $totalFindTime)

# Resolving them couldn’t be simpler
# Lets count the severities we’re clearing, though.
$progress = ""
foreach($alert in $openHPAlerts)
{
    switch ($alert.Severity) {
        "Information" {$iinfo++}
        "Warning"     {$iwarn++}
        "Error"       {$ierr++}
        "Critical"    {$icrit++}
        default       {$iunk++}   # anything else lands in the 'unknown' counter
    }
    $progress += "Server: " + $alert.NetBiosComputerName + " - Rule '" + $alert.Name + "' - Repeat Count: " + $alert.RepeatCount + " `n"
    # $pbag.AddValue("AutoResolveFor", $alert.NetBiosComputerName)
    resolve-alert -comment $comment -Alert $alert
}

$pbag.AddValue("Info_HP_Alerts", $iinfo)
$pbag.AddValue("Warn_HP_Alerts", $iwarn)
$pbag.AddValue("Err_HP_Alerts", $ierr)
$pbag.AddValue("Crit_HP_Alerts", $icrit)
# Submit property bag to SCOM
$momapi.Return($pbag)

# Log eventviewer event to let us know what we did
# Severities: 1 = Error, 2 = Warning, 4 = Informational - "Script Name", "Event ID", "Severity", "Description"
$momapi.LogScriptEvent("SCOM-Resolve-HardwareAlerts.ps1", 926, 4, "Successfully resolved " + $openCount + " alerts. `nReport:`n" + $progress)
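
The script formats the elapsed time by casting the ticks to a [datetime], which works but is a bit of a hack. An alternative sketch that reads the TimeSpan fields directly is below; `Format-Duration` is a made-up helper name, not part of the script above.

```powershell
# Hypothetical helper: format a TimeSpan (e.g. from Measure-Command) as HH:mm:ss.
function Format-Duration([TimeSpan]$span) {
    # TotalHours is floored so runs longer than a day don't wrap back to 00.
    $hours = [int][math]::Floor($span.TotalHours)
    "{0:00}:{1:00}:{2:00}" -f $hours, $span.Minutes, $span.Seconds
}

Format-Duration ([TimeSpan]::FromSeconds(3725))   # → 01:02:05
```

In the script it would be called as `Format-Duration $findAlertsTime`.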
 

Posted in HP, Powershell, SCOM | No Comments »

SP1 Gem: Finding rules running on remote agents

Posted by Jeremy D. Pavleck on 13th June 2008

Remember in MOM 2005 the ease with which you could find out exactly what rules were running on what agent? It was as simple as going to Administration > Computers > Agent-managed computers > Right-click on the server you’re interested in, hit properties then the “Rules” tab and bam:

Showing rules deployed to an agent in MOM 2005

In OpsMgr, it’s not quite so easy. Sure, there are a ton of powershell cmdlets available to help you figure it out - I think. I haven’t quite managed to get the correct order of Get-Agent, Get-MonitoringObject, Get-MonitoringClass, Get-Monitor, Get-Rule to produce a successful listing of what runs on a particular agent. I’m fairly positive it has to be there, I just haven’t found it [1]. So it was a tad difficult to find out what was running on a SCOM2007 agent.

Well, until Service Pack 1 RTM came around, that is. Located in UpdateCDImage\SupportTools folder is a management pack called “Microsoft.SystemCenter.Internal.Tasks.mp”. Go ahead and import it into your OpsMgr installation. I’ll wait right here.

Finished? Excellent. Now go to the Monitoring tab of the OpsMgr console and select “Computers” under Monitoring. Under the Windows Computer Tasks on the right side, you should see 4 new ones.

New Tasks!

The new tasks are

  • Resubmit local cache state change events
  • Show Failed Rules and Monitors for this Object
  • Show Local Cache for State Change Events
  • Show Running Rules and Monitors for this Object

What we’re most interested in is the “Show Running Rules and Monitors for this Object” task.

Click on it, submit the task, wait for it to crunch for a little bit and then you’re presented with an excellent little screen like below:

Pretty nice, eh? A complete, valid XML document listing the rules & monitors (Called ‘Workflows’ in this case) running on the agent.

Now, let’s PowerShell that up a bit - you can pull out all that information from within the Command Shell, in a nice little function.

Oh also, while we’re on the subject, I’ve found the super super easy way to determine which agent a HealthService ID belongs to. I know my previous result used raw queries to the SQL database and all that jazz, but not this one. Ready for it? If you blink, you might miss it. Here it is!

(Get-MonitoringObject -id "HealthService ID Here").DisplayName

Yep, it’s that easy. The wonders of powershell, eh?

Anyway, to run this task on an agent from within powershell, we have to do a little more work, but it’s really not all that bad.

Function Get-ActiveRules has 2 arguments: -server and -location. Server is self-explanatory, and it’s written to not require a FQDN; if your server is MYSERVER01.midwest.dc02.company.com, you just need to use MYSERVER01. The second argument is a location and filename for the output XML file. If left blank, it defaults to making C:\$server-rules.xml.

There is virtually no serious error checking, and it just dumps the task Output field to an XML file instead of doing any magic with it, so it’s nothing fancy - take it as it is.

Enjoy! Download Get-ActiveRules.ps1

## Get-ActiveRules grabs the workflows running on the specified server
function Get-ActiveRules ([string]$server, [string]$location) {
    If (!$location) { $location = "C:\$server-Rules.xml" }
    # Create the Task object
    $taskobj = Get-Task | Where-Object {$_.Name -eq "Microsoft.SystemCenter.GetAllRunningWorkflows"}
    # Make sure we have it, if not, the MP isn’t installed.
    If (!$taskobj) {
        Write-Host "Unable to find required monitoring tasks - MS System Center Internal Tasks MP needs to be installed." -ForeGroundColor Magenta
        # return, not break: a break outside of a loop can also stop the calling script
        return
    }
    # Grab HealthService class object
    $hsobj = Get-MonitoringClass -name "Microsoft.SystemCenter.HealthService"
    # Find HealthService object defined for named server
    $monobj = Get-MonitoringObject -MonitoringClass $hsobj | Where-Object {$_.DisplayName -match $server}
    # Now actually proceed with the task. I have mine formatted like this version, but I’ve added some light
    # error checking for the ‘public’ version.
    #(Start-Task -task $taskobj -TargetMonitoringObject $monobj).Output | Out-File C:\$server-Rules.xml
    $taskOut = Start-Task -Task $taskobj -TargetMonitoringObject $monobj
    # See if it worked, if it did, export out the Output part and save as an XML file, then display some items.
    If ($taskOut.ErrorCode -eq 0) {
        [xml]$taskXML = $taskOut.Output
        $ruleCount = $taskXML.DataItem.Count
        Write-Host "Succeeded in gathering rules for $server" -ForeGroundColor Green
        Write-Host "Currently $ruleCount rules active." -ForeGroundColor Green
        Write-Host "Exporting to $location" -ForeGroundColor Green
        $taskOut.Output | Out-File $location
    } else {
        # Use subexpressions here: "text" + $value after Write-Host would print a literal "+"
        Write-Host "Error gathering rules for $server" -ForeGroundColor Magenta
        Write-Host "Error Code: $($taskOut.ErrorCode)" -ForeGroundColor Magenta
        Write-Host "Error Message: $($taskOut.ErrorMessage)" -ForeGroundColor Magenta
    }

} # End Get-ActiveRules
#######################
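
One thing to watch: the function finds the HealthService with a plain `-match $server`, so MYSERVER01 would also pick up MYSERVER011. If that bites you, a stricter comparison could look like the sketch below. `Test-ServerNameMatch` is a hypothetical helper, and it assumes the DisplayName is either the short name or the FQDN.

```powershell
# Hypothetical helper: true when $displayName is $server itself or an FQDN starting with "$server."
function Test-ServerNameMatch([string]$displayName, [string]$server) {
    # Escape the name so dots or other regex characters in it are taken literally.
    $pattern = '^' + [regex]::Escape($server) + '(\.|$)'
    $displayName -match $pattern
}

Test-ServerNameMatch 'MYSERVER01.midwest.dc02.company.com'  'MYSERVER01'   # → True
Test-ServerNameMatch 'MYSERVER011.midwest.dc02.company.com' 'MYSERVER01'   # → False
```

Inside Get-ActiveRules it would slot into the Where-Object filter as `{ Test-ServerNameMatch $_.DisplayName $server }`.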

  [1] If you happen to know, please tell me!

Posted in Powershell, SCOM | No Comments »

SCOM One-Liners: Get the Host name of your remotely managed device.

Posted by Jeremy D. Pavleck on 5th June 2008

One of the nice things about SCOM2007 is the ease with which you can monitor and manage non-Windows devices (Even better with the dhjdhdhdhd addon). Being able to add your switches, routers and Unix devices makes for a more complete overview of the health of your system.

There is a downside though, and that would be that although you can use Get-RemotelyManagedDevice in the Command shell to list the, umm, remotely managed devices, all it will return is the IP address.

Here’s a SCOM command shell one-liner that will use some .Net-Fu to reverse the IP for you, on the fly:

Get-RemotelyManagedDevice | ForEach-Object {Write-Host "IP $($_.Name) resolves to hostname $([System.Net.Dns]::GetHostByAddress($_.Name).HostName)" }
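
The one-liner will throw an error for any address that has no reverse DNS record. A more forgiving lookup might look like the sketch below. Note the assumptions: `Resolve-DeviceHostName` is a name I made up, it uses GetHostEntry (GetHostByAddress is marked obsolete in .NET), and try/catch needs PowerShell v2 or later.

```powershell
# Hypothetical helper: reverse-resolve an IP, falling back to a message instead of an error.
function Resolve-DeviceHostName([string]$ip) {
    try {
        [System.Net.Dns]::GetHostEntry($ip).HostName
    } catch {
        "Unable to resolve IP"
    }
}
```

It drops into the one-liner in place of the GetHostByAddress call, e.g. `$(Resolve-DeviceHostName $_.Name)`.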

If you want, you can go a step further and add it to your profile (Or more preferably, a SCOM snippet file that I’ll be writing about later.) and pull it up anytime you want.

The Display-NetAgentsByHostName function has one parameter, -short. Run the default ‘long’ way, it will reverse the IP, list the client that is managing it, the management group and the health state.
With the -short switch, it simply outputs health state, host name and IP address.

It’s not perfect, and could use some formatting, but it works as is, so I leave it up as an exercise to the reader to polish it some more ;)
Page formatting messing it up? I’ll fix it some day - until then, download Display-NetAgentsByHostName.ps1

function Display-NetAgentsByHostName([switch]$short)
{
    # Reverse-resolve an IP; the trap turns a failed lookup into a message
    function LookUp([string]$ip)
    {
        trap {
            "Unable to resolve IP"
            continue;
        }
        ([System.Net.Dns]::GetHostByAddress($ip)).HostName
    }
    Get-RemotelyManagedDevice | ForEach-Object {
        If (!$short) {
            If ($_.HealthState -eq "Success") {
                Write-Host ("IP Address $($_.Name) resolves to '$(Lookup($_.Name))' - Managed by server: $($_.ProxyAgentPrincipalName.Split('.;')[0])" +
                    " in Management group: $($_.ManagementGroup) - Health State: $($_.HealthState)`n") -ForeGroundColor Green
            } elseif ($_.HealthState -eq "Warning") {
                Write-Host ("IP Address $($_.Name) resolves to '$(Lookup($_.Name))' - Managed by server: $($_.ProxyAgentPrincipalName.Split('.;')[0])" +
                    " in Management group: $($_.ManagementGroup) - Health State: $($_.HealthState)`n") -ForeGroundColor Yellow
            } elseif ($_.HealthState -eq "Error") {
                Write-Host ("IP Address $($_.Name) resolves to '$(Lookup($_.Name))' - Managed by server: $($_.ProxyAgentPrincipalName.Split('.;')[0])" +
                    " in Management group: $($_.ManagementGroup) - Health State: $($_.HealthState)`n") -ForeGroundColor Red
            }
        } else {
            If ($_.HealthState -eq "Success") {
                Write-Host "HEALTHY - Host: $(Lookup($_.Name)) - IP: $($_.Name)" -ForeGroundColor Green
            } elseif ($_.HealthState -eq "Warning") {
                Write-Host "HEALTH WARNING! State: $($_.HealthState) - Host: $(Lookup($_.Name)) - IP: $($_.Name)" -ForeGroundColor Yellow
            } elseif ($_.HealthState -eq "Error") {
                Write-Host "HEALTH ERROR! State: $($_.HealthState) - Host: $(Lookup($_.Name)) - IP: $($_.Name)" -ForeGroundColor Red
            }
        }
    }
}
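
As a starting point for that polishing, the three near-identical branches could be collapsed by looking the colour up once. This is only a sketch: `Get-HealthColor` is a made-up helper, and the Gray fallback for any other state is my own assumption (the function above prints nothing in that case).

```powershell
# Hypothetical lookup table: health state name -> console colour.
$healthColors = @{ Success = 'Green'; Warning = 'Yellow'; Error = 'Red' }

function Get-HealthColor([string]$state) {
    # Hashtable literals in PowerShell are case-insensitive, so 'success' works too.
    if ($healthColors.ContainsKey($state)) { $healthColors[$state] } else { 'Gray' }
}

Get-HealthColor 'Warning'   # → Yellow
```

Each Write-Host call could then take `-ForeGroundColor (Get-HealthColor $_.HealthState)`, leaving a single message per mode.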

Posted in Powershell, SCOM, SCOM Snippets | No Comments »