
Random Streams of Consciousness during “USE IT OR LOSE IT” vacation: Remote PC is ^&%$!* AWESOME and why isn’t everybody talking about it?

So I am sitting in the coffee shop at my hotel on the Oregon coast, using some of my “use it or lose it” vacation before the year ends. Last night, I literally had the greatest pint of beer in my entire life (India Pelican Ale from the Pelican Bar and Grill in Pacific City, Oregon) and have just noticed that the same bar I drank it at is opening for breakfast at 8 AM. I am wondering if I have a drinking problem because I have now rationalized that it is 11 AM (early lunch) on the East Coast and I could easily blame drinking this early on jet lag. Like a lot of IT workaholics, I am really trying to get better at this whole “vacation” thing. At any rate, I thought I would sit down and read Jarian Gibson’s post on Remote PC and try to “get hip” to it as I myself am pretty excited about it.

For the record, I am NOT a VDI guy. That in and of itself is no longer a badge of shame, and it has been nice to see the virtualization community become more tolerant of those who are not jumping for joy over VDI. That said, I think VDI is totally cool but it is very hard to justify paying more for desktop delivery and trying to sell OPEX savings to CIOs who worry about the next quarterly stock call. Selling OPEX in the world of publicly held companies is a tough row to hoe. Then, in May, I read Jarian Gibson’s blog about Remote PC, to which I immediately asked “Can I haz it?”

Now I am excited. This works better than traditional VDI for SO MANY reasons. Let’s take 1000 flex/teleworkers.

The 1000 Teleworker Scenario:
Say you have to set up a telework solution for 1000 remote users. Typically this involves procuring 1000 laptops, sending the users home with them, and then building out the back end infrastructure to support either XenApp or XenDesktop.

Sending users home with a Laptop and providing VDI Access:
I am doing some brief estimating here, but assuming a laptop costs around $1000, supplying 1000 end users with one puts a cool 1 million dollars into the project right out of the gate.

Project Cost so far: $1,000,000

Supporting the back end infrastructure:

As a quick, rough estimate of VDI memory/core requirements, I would say you need at least 20 servers to accommodate 1000 XenDesktop users. At around $10K per server you are looking at another $200,000 in back end hardware costs (not to mention licensing).

Project Cost so far: $1,200,000

So, in addition to the licensing debacle that you get to go through with Microsoft (one I have since thrown my hands up in disgust over) and the setup of the infrastructure, you are 1.2 million into this deployment. We could switch to XenApp (Now we’re talkin!) to save a little more. If you use VMware (I don’t want to get involved in the hypervisor holy war) then you are going to have more cost as well.

So with XenApp, I think you should be able to get by with 9 servers (30 users per 8GB VM w/2 vCPUs). At 9 servers you are looking at $90,000 in hardware, which puts the project at around $1.09 million. Nice savings, but you are still stuck with building out the back end infrastructure.

Remote PC Scenario:

With the Remote PC scenario, we get a chance to actually take advantage of BYOD (bear with me here) and of the cheap PC models that are out there. We can replace the $1000 laptop with a $400-$600 desktop (bulk PC purchases get this kind of leverage). This presents an opportunity to reduce that initial cost from $1 million to $400K-$600K. (Let’s use $500K as a baseline.)

Now you’re talking the “language of love” to your CIO: CAPEX. Not only have you reduced the initial procurement costs, but you do not need to build out the same amount of back end infrastructure. In the Remote PC scenario, you have your DDCs brokering the connections, but the XenApp/XenDesktop farms are completely gone, or reduced to maybe one or two XenApp servers for apps like ArcGIS, SAS and CAD.
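To put the three scenarios side by side, here is a minimal Python sketch of the arithmetic. It uses only the rough figures quoted above ($1000 laptops, the $500 desktop baseline, $10K servers) and, like the estimates themselves, leaves licensing out. The function name and the choice of two XenApp servers for the Remote PC case are my own assumptions for illustration.

```python
# Rough hardware-only comparison using the estimates from this post.
# Licensing and labor are deliberately left out, as in the text.
USERS = 1000
LAPTOP_COST = 1000     # per-user laptop estimate
DESKTOP_COST = 500     # baseline for a bulk-purchased desktop
SERVER_COST = 10_000   # per back end server estimate


def project_cost(endpoint_cost, servers, users=USERS, server_cost=SERVER_COST):
    """Endpoint purchases plus back end servers (hypothetical helper)."""
    return users * endpoint_cost + servers * server_cost


scenarios = {
    "Laptops + XenDesktop (20 servers)": project_cost(LAPTOP_COST, 20),
    "Laptops + XenApp (9 servers)": project_cost(LAPTOP_COST, 9),
    # Assumption: two XenApp servers kept for the heavy apps.
    "Remote PC desktops (2 XenApp servers)": project_cost(DESKTOP_COST, 2),
}

for name, cost in scenarios.items():
    print(f"{name}: ${cost:,}")
```

Swap in your own per-unit prices and server counts; the point is only that the endpoint line dominates the total, which is why halving it matters more than trimming servers.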

I have spent hours extrapolating CPU cores and RAM to try and come up with a user density. In all likelihood you have several thousand cores and several terabytes of RAM sitting at desks and in cubicles that can now be tapped into for remote access using Remote PC.


Why this would work?

While working to set up the teleworking solution at a previous employer we noted a few things. First, after making a seven-figure investment in laptops, we found that only 20% of them (that’s a generous number) actually connected to us remotely. The remaining users insisted on using their own equipment. Take my case for example: at my desk at home, I have the “Crackpot Command Center” going with four monitors and a ball busting six core 16GB system (as any true geek would). So, when I want to connect to work, am I supposed to unplug everything and connect my keyboard, mouse and ONE MONITOR (seriously?) to my laptop? Maybe two monitors if I have a docking station? No freakin’ way!

Even non-geeks have their setup at home already and I doubt they have a data-switch box to switch back and forth so a teleworker can either work from the kitchen table OR they can UNPLUG their monitor and plug it into the docking station or laptop? The fact is, this is just not likely and the same end user would likely prefer to just use their equipment. This is something I witnessed first-hand to the complete shock of management.

In addition to the BYOD paradigm or UYOD (Use your own device) paradigm you also maintain support continuity. The first time we discussed VDI with our server group my team looked at me like I was crazy. First off, desktop management is 20 years in the making and there are old, established methods of supporting it. A complete forklift of the status quo is much more difficult than just provisioning desktops on the fly.

One of the issues with VDI was the inability to get people to understand that your 5 person Citrix team cannot support 10,000 desktops. Even more, they did not put 5 to 10 years into their IT careers to go back to supporting desktops. I personally am not overly excited to deal with desktops after 16+ years in this industry and neither are most of the server admins I work with. The inability to integrate XenDesktop/View/VDI in general with the incumbent support apparatus at an organization is a significant and, in my opinion, often overlooked barrier to adoption. Your Citrix team likely is not THAT excited about doing it and the desktop team is completely intimidated by it. We go from imaging systems with the “Corporate Image” to setting up PVS, configuring PXE, DHCP scopes and DHCP failover, logging into the hypervisor, etc. Folks, it’s a desktop, yet the level of complexity in a large scale deployment is far more advanced and much less forgiving than imaging laptops as they come in the door. Advances in SCCM integration for XenDesktop were a very welcome and timely feature, but ultimately Remote PC delivers continuity of support as it is little more than an agent installed on the existing PC. The same people who support the PC today can continue to do so, server admins are not being asked to become desktop admins, and the only thing that changes is that you are extending your infrastructure into the cloud by integrating the DDC and the Access Gateway, allowing users the same consistent experience regardless of where they are working from.

You know what, I could BE a VDI guy if:

• I don’t have to put Windows 7 images on my million-dollar SAN (I LOVE Atlantis but it is not safe to assume your Citrix team can put the squeeze on your storage team)
• I don’t have to strong arm Server Admins to do Desktop Support
• I don’t have to buy a 2nd Windows License (or deal with Licensing)
• It can be made consistent enough that the incumbent Desktop team can support it

Holy crap! I’m out of excuses…I think I could become a VDI guy…

Hey CCS! I bet you can even install the Edgesight agent?! (They’ll get the joke) What’s not to like here? Yes, VMware, HP/Dell/Cisco might be a little bent for a while since you won’t need as much hardware/hypervisor software, and Microsoft might find themselves chagrined as they cannot gouge you for more licensing costs, but in the end you get to simply extend your enterprise into the cloud without drastically changing anyone’s role. This also allows organizations to wade into VDI instead of standing at the end of the high dive at the public pool while Citrix, VMware and Gartner chant “JUMP, JUMP, JUMP!”

Isn’t that what we wanted when all this started?

Thanks for reading

John


Preparing for life without Edgesight with ExtraHop

So, the rumors have been swirling and I think we have all come to the quiet realization that Edgesight is going to be coming to an end. At least the Edgesight we know and Love/Hate.

For those of us who have continued with this labor of love, trying to squeeze every possible metric we could out of Edgesight, we are likely going to have to come to grips with the fact that the next generation of Edgesight will not have the same level of metrics we have today. While we all await the next version of HDX Edgesight, we can almost be certain that the data model and all of the custom queries we have written over the last 3 years will not be the same.

Let’s be honest, Edgesight has been a nice concept but there have been extensive problems with the agent, both from a CPU standpoint (the Firebird service taking up 90% CPU) and in keeping the versions consistent. The real-time monitoring requires elevated permissions for the person looking into the server, forcing you to grant your service desk higher permissions than many engineers are comfortable with. I am, for the most part, a “tools”-hater. In the last 15 years I have watched millions of dollars spent on any number of tools, all of which told me that they would be the last tool I would need, and all of which, in my opinion, were for the most part underwhelming. I would say that Edgesight has been tolerable to me and it has done a great job of collecting metrics but, like most tools I have worked with, it is agent based, and it cannot log in real-time. The console was so unusable that I literally have not logged into it for the last four years. (In case you were wondering why I don’t answer emails with questions about the console.)

For me, depending on an agent to tell you there is an issue is a lot like telling someone to “yell for help if you start drowning”. If a person is under water, it’s a little tough for them to yell for help. With agents, if there is an issue with the computer, whatever that is (CPU, disk I/O, memory) will likely impact the agent as well. The next best thing, which is what I believe Desktop Director is using, is to interrogate a system via WMI. Thanks to folks like Brandon Shell, Mark Schill and the people at Citrix who set up the PowerShell SDK, this has given rise to some very useful scripting that has given us the real-time logs that we have desperately wanted. That works great for looking at a specific XenApp server, but in the Citrix world where we are constantly “proving the negative” it does not provide the holistic view that Edgesight’s downstream server metrics provided.

Proving the negative:

As some of you are painfully aware, Citrix is not just a Terminal Services delivery solution. In our world, XenApp is a Web Client, a Database Client, Printing Client and a CIFS/SMB client. The performance of any of these protocols will result in a ticket resting in your queue regardless of the downstream server performance. Edgesight did a great job of providing this metric letting you know if you had a 40 second network delay getting to a DFS share or a 5000ms delay waiting for a server to respond. It wasn’t real-time but it was better than anything I had used until then.

While I loved the data that Edgesight provided, the agent was problematic to work with and I had to wait until the next day to actually look at the data. Unless you ran your own queries and did your own BI integration you had yet another console to go to, and you needed to provide higher credentials for the service desk to use the real-time console.

Hey! Wouldn’t it be great if there were a solution that would give me the metrics I need to get a holistic view of my environment? Even better, if it were agentless I wouldn’t have to worry about which .NET Framework version I had, changes in my OS, the next security patch that takes away kernel level access, or just all-around agent bloat from the other two dozen agents I already have on my XenApp server. Not to mention the fact that the decoupling of GUIDs and images thanks to PVS has caused some agents to really struggle to function in this new world of provisioned server images.

It’s early in my implementation but I think I have found one….Extrahop.

Extrahop is the brain-child of ADC pioneer Jesse Rothstein, who was one of the original developers of the modern Application Delivery Controller. The way Extrahop works is that it sits on the wire, grabs pertinent data and makes it available to your engineers and, if you want, your Operations staff. Unlike Wireshark, which is a great tool for troubleshooting, it does not force you, figuratively, to drink water from a fire hose. They have formed relationships with several vendors, gained insight into their packets and are able to discriminate between which packets are useful to you and which packets are not. I am now able to see, in real-time, without worrying about an agent, ICA launch times and the authentication time when a user launches an application. I can also see client latency and Virtual Channel Bytes In and Bytes Out for Printer, Audio, Mouse, Clipboard, etc.

(The Client-Name, Login time and overall Load time as well as the Latency of my Citrix Session)

In addition to the Citrix monitoring, it helps us with “proving the negative” by providing detailed data about Database, HTTP and CIFS connections. This means that you can see, in real-time, performance metrics of the application servers that XenAPP is connecting to. If there is a specific URI that is taking 300 seconds to process, you will see it when it happens without waiting the next day for the data or having to go to edgesightunderthehood.com to see if John, David or Alain have written a custom query.

If there is a conf file that has an improper DNS entry, it will show up as a DNS Query failure. If your SQL Server is getting hammered and is sending RTOs, you will see it in real-time/near-time and be able to save yourself hours of troubleshooting.

(Below, you see the different metrics you can interrogate a XenApp server for.)


Extrahop Viewpoints:
Another advantage of Extrahop is that you can actually look at metrics from the point of view of the downstream application servers as well. This means that if you publish an IE Application and it connects to a web server that integrates with a downstream database server you can actually go to that web server you have published in your application and look at the performance of that web server and the database server. If you have been a Citrix Engineer for more than three years, you should already be used to doing the other team’s troubleshooting for them but this will make it even faster. You basically get a true, holistic view of your entire environment, even outside of XenApp, where you can find bottlenecks, flapping interfaces and tables that need indexing. If your clients are on an internal network, depending on your topology you can actually look at THEIR performance on their workstations and tell if the switch in the MDF is saturated.

Things I have noted so far looking at Extrahop Data:

  • SRV Record Lookup failures
  • Poorly written Database Queries
  • Excessive Retransmissions
  • Long login times (thus long load times)
  • Slow CIFS/SMB Traffic
  • Inappropriate User Behavior

GEOCODING Packets:
Another feature I like is the geocoding of packets, this is very useful to use if you want to bind a geomap to your XenApp servers to see if there is any malware making connections to China or Russia, etc. (I have an ESUTH post on monitoring Malware with Edgesight.) Again, this gives me a real-time look at all of my TCP Connections through my firewall or I can bind it on a per-XenApp, Web Server or even PC node. The specific image below is of my ASA 5505 and took less than 15 seconds to set up (not kidding).

On the wire (Extrahop) vs. On the System (Agent):
I know most of us are “systems” guys and not so much network guys. Because there is no agent on the system and it works on the wire, you have to approach it a little differently, and then you can see how you can live without an agent. Just about everything that happens in IT has to come across the wire and you already have incumbent tools to monitor CPU, memory, disk and Windows events. The wire is the last “blind spot” that I have not had a great deal of visibility into from a tools perspective until I started using Extrahop. Yes, there was Wireshark, but archiving the data and looking at specific streams are not quite as easy. Yes, you can filter and you can “Follow TCP Stream” with Wireshark but it is going to give you very raw data. I even edited a TCPDUMP based PowerShell script to write the data to SQL Server thinking I could archive the data that way. I had 20GB of data inside of 30 minutes. With Extrahop you can actually trigger wire captures based on specific metrics and events that it sees in the flow, and all of the sifting and stirring is done by Extrahop, just leaving you to collect the gold nuggets.

Because it is agentless you don’t have questions like “Will Extrahop support the next edition of XenApp?”, “Will Extrahop support Windows Server 2012?”, “What version of the .NET Framework do I need to run Extrahop?” or “I am on server version X but my agents are on version Y.”

The only question you have to answer to determine if your next generation of hardware/software will be compatible with Extrahop is “Will you have an IP Address?” If your product is going to have an IP Address, you can use Extrahop with it. Now, you have to use RFC Compliant protocols and Extrahop has to continue to develop relationships with vendors for visibility but in terms of deploying and maintaining it, you have a much simpler endeavor than other vendors. The simplicity of monitoring on the wire is going to put an end to some of the more memorable headaches I have had in my career revolving around agent compatibility.

Splunk/Syslog Integration:
So, I recently told my work colleagues that the next monitoring vendor that shows up saying I have to add yet another console is going to hear “no thanks”. While the Extrahop console is actually quite good and gives you the ability to logically collate metrics, applications and devices the way you like, it also has extensive Splunk integration. If there are specific metrics that you want sent to an external monitor, you can send them to your syslog server and integrate them into the existing syslog strategy, be it Envision, KIWI Syslog Server or any other SIEM product. They have a JavaScript based trigger solution that allows you to tap into custom flows and cherry pick those metrics that are relevant to you. Currently, there is a very nice and extensive Splunk app for Extrahop.

I am currently logging (in real-time) the following with Extrahop:

  • DNS Failures (Few people realize how poor DNS can wreck nth-tiered environments)
  • ICA OPEN Events (to get logon times and authentication times)
  • HTTP User Agent Data
  • HTTP Performance Data

So if this works by monitoring the wire, isn’t it the Network team’s tool?
The truth is it’s everybody’s tool. The only thing you need the network team to do is span ports for you (then log in and check out their own important metrics). You can have the DBA log in and check the performance of their queries. The network engineers can log in and check jitter, TCP retransmissions, RTOs and throughput. The Citrix guy can log in and check client latency, STA ticket delivery times, ICA channel throughput and logon/launch times. The security team can look for TCP connections to China or Russia and catch people RDPing home to their home networks, and the web team can check which user agents are the most popular to determine if they need to spend more time accommodating tablets. Everybody has something they need on the wire; I sometimes fear that we tend to select our tools based on what technical pundits tell us to. In our world, from a vendor standpoint, we tend to like to put things in boxes (which is a great irony given everyone’s “think outside the box” buzz statement). We depend on thought leaders to put products in boxes and tell us which ones are leaders, visionaries, etc. I don’t blame them for providing product evaluations that way; we have demanded that. For me, Extrahop is a great APM tool but it is also a great network monitoring tool and has value to every branch of my IT department. This is not a product whose value can be judged by finding its bubble in a Gartner scatter plot.

Conclusion:
I have not even scratched the surface of what this product can do. The triggers engine basically gives you the ability to write nearly any rule you want to log/report any metric you want. Yes, there are likely things you can get with an agent that you cannot get without an agent, but in the last few years these agents have become a lot like a ball and chain. You basically install the appliance or import the VM, span the ports and watch the metrics come in. I have had to change my way of thinking about metrics gathering from system specific to siphoning data off the wire, but once you wrap your head around how it is getting the data you really get a grasp of how much more flexibility you have with this product than with other agent based solutions. The Splunk integration was the icing on the cake.

I hope to record a few videos showing how I am doing specific tasks, but please check out the links below as they have several very good live demos.

To download a trial version: (you have to register first)
http://www.extrahop.com/discovery/

Numerous webinars:
http://www.extrahop.com/resources/

Youtube Channel:
http://www.youtube.com/user/ExtraHopNetworks?feature=watch

Thanks for reading and happy holidays!

John


ICASTART, ICAEND “ICA-LIKE!!!”

In 2008 I had a conversation with Jay Tomlin asking him if he would put in an enhancement for ICA Logging on the AGEE. Basically we wanted the ability to see the external IP Addresses of our customers coming through the Access Gateway. As you are likely aware, what you get in the logs are the IP Addresses bound to the workstation and not the external IP Address that they are coming through. In the last ten years, it has become increasingly rare for an end user to actually plug their computer directly into the internet and more often, they are proxied behind a Netgear, Cisco/Linksys, and Buffalo switch. This makes reporting on where the users are coming from somewhat challenging.

Somewhere between 9.2 and 9.3 the requested enhancement was added and it included other very nice metrics as well. The two syslog events I want to talk about are ICASTART and ICAEND.

ICASTART:
The ICASTART event contains some good information in addition to the external IP. Below you see a sample of the ICASTART log.

12/09/2012:14:40:46 GMT ns 0-PPE-0 : SSLVPN ICASTART 540963 0 : Source 192.168.1.98:62362 – Destination 192.168.1.82:2598 – username:domainname mhayes:Xentrifuge – applicationName Desktop - startTime “12/09/2012:14:40:46 GMT” - connectionId 81d1

As you can see, if you are a log monger, this is a VERY nice log!! (Few can appreciate this) With the exception of the credentials everything is very easy to parse and place into those nice SQL Columns I like. If you have Splunk, parsing is even easier and you don’t have to worry about how the columns line up.

ICAEND:
The ICAEND event actually has quite a bit more information, and were it not for the need to report ICA sessions in real time, this is the only log you would need. Below is the ICAEND log.

12/09/2012:14:41:12 GMT ns 0-PPE-0 : SSLVPN ICAEND_CONNSTAT 541032 0 : Source 192.168.1.98:62362 – Destination 192.168.1.82:2598 – username:domainname mhayes:Xentrifuge – startTime “12/09/2012:14:40:46 GMT” – endTime “12/09/2012:14:41:12 GMT” – Duration 00:00:26 – Total_bytes_send 9363 – Total_bytes_recv 587588 – Total_compressedbytes_send 0 – Total_compressedbytes_recv 0 – Compression_ratio_send 0.00% – Compression_ratio_recv 0.00% – connectionId 81d16

Again, another gorgeous log that is very easy to parse and put into some useful information.
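For anyone who would rather prototype the parsing outside of KIWI first, here is a hedged Python sketch that pulls the same fields out of both events with regular expressions. The field names in the returned dictionary are my own, and the patterns are based only on the two sample lines above, so treat them as assumptions to check against your own logs. The patterns accept either a plain hyphen or a typographic dash between fields, and either straight or curly quotes, because blog formatting tends to alter those characters.

```python
import re

SEP = r"\s*[-\u2013]\s*"            # hyphen or en dash between fields
QUOTE = r'["\u201c\u201d]'          # straight or typographic double quote
NOT_QUOTE = r'[^"\u201c\u201d]+'

# Fields that appear in both ICASTART and ICAEND_CONNSTAT.
HEAD = re.compile(
    r"SSLVPN (?P<event>ICASTART|ICAEND_CONNSTAT) \d+ \d+ : "
    r"Source (?P<source_ip>[\d.]+):(?P<source_port>\d+)" + SEP +
    r"Destination (?P<destination_ip>[\d.]+):(?P<destination_port>\d+)" + SEP +
    r"username:domainname (?P<user>[^:\s]+):(?P<domain>\S+)"
)

# Fields that are present in only one of the two events, or that follow later.
FIELDS = {
    "application": re.compile(r"applicationName (.+?)" + SEP + "startTime"),
    "start_time": re.compile("startTime " + QUOTE + "(" + NOT_QUOTE + ")" + QUOTE),
    "end_time": re.compile("endTime " + QUOTE + "(" + NOT_QUOTE + ")" + QUOTE),
    "duration": re.compile(r"Duration (\d+:\d{2}:\d{2})"),
    "bytes_sent": re.compile(r"Total_bytes_send (\d+)"),
    "bytes_received": re.compile(r"Total_bytes_recv (\d+)"),
    "connection_id": re.compile(r"connectionId (\w+)"),
}


def parse_ica_event(line):
    """Return a dict of fields, or None if the line is not an ICA event."""
    head = HEAD.search(line)
    if head is None:
        return None
    record = head.groupdict()
    for name, pattern in FIELDS.items():
        match = pattern.search(line)
        record[name] = match.group(1) if match else None
    for name in ("bytes_sent", "bytes_received"):
        if record[name] is not None:
            record[name] = int(record[name])
    return record


SAMPLE_START = (
    '12/09/2012:14:40:46 GMT ns 0-PPE-0 : SSLVPN ICASTART 540963 0 : '
    'Source 192.168.1.98:62362 - Destination 192.168.1.82:2598 - '
    'username:domainname mhayes:Xentrifuge - applicationName Desktop - '
    'startTime "12/09/2012:14:40:46 GMT" - connectionId 81d1'
)

print(parse_ica_event(SAMPLE_START))
```

The timestamps are left as strings on purpose, since the sample alone does not say whether the date is month-first or day-first.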

Logging the Data:
So, this was going to be my inaugural Splunk blog but I didn’t get off my ass and so my eval of Splunk expired and I have to wait 30 days to use it again (file that under “phuck”). So today we will be going over logging the data with the standard KIWI/SQL (basically a poor man’s Splunk) method.

So the way we log the data, if you haven’t been doing this already, is we configure the Netscaler to send logs to the KIWI Syslog server and we use the custom data source within KIWI to configure a SQL Logging rule. We then create the table, parse the data with a parsing script and voila, instant business intelligence.

Creating the custom KIWI Rule:

First, create the rule “ICA-START/END” with a descriptive filter configured as you see below.

Next you will optionally configure a Display action but more importantly you will configure the Script that parses the data.

Paste the following text (Below) into a file named Script_Parse_AGEE-ICA.txt and save it in the scripts directory of your KIWI install.

Function Main()

    Main = "OK"

    ' Parsed fields
    Dim MyMsg
    Dim UserName, Application, SourceIP, DestinationIP
    Dim StartTime, EndTime, Duration
    Dim SentBytes, RecBytes, ConnectionID

    ' Start/end positions used while slicing the message
    Dim SrcBeg, SrcEnd, DstBeg, DstEnd, UserBeg, UserEnd, AppBeg, AppEnd
    Dim StartBeg, StartEnd, EndBeg, EndEnd, DurBeg, DurEnd
    Dim SentBeg, SentEnd, RecBeg, RecEnd, ConBeg

    With Fields

        UserName = ""
        Application = ""
        SourceIP = ""
        DestinationIP = ""
        StartTime = ""
        EndTime = ""
        Duration = ""
        SentBytes = ""
        RecBytes = ""
        ConnectionID = ""

        MyMsg = .VarCleanMessageText

        ' ICAEND_CONNSTAT: session summary written when the ICA session closes
        If ( Instr( MyMsg, "ICAEND_CONNSTAT" ) > 0 ) Then
            ' Source and Destination are "ip:port"; keep only the IP
            SrcBeg = Instr( MyMsg, "Source") + 6
            SrcEnd = Instr( SrcBeg, MyMsg, ":")
            SourceIP = Trim( Mid( MyMsg, SrcBeg, SrcEnd - SrcBeg) )

            DstBeg = Instr( MyMsg, "Destination") + 11
            DstEnd = Instr( DstBeg, MyMsg, ":")
            DestinationIP = Trim( Mid( MyMsg, DstBeg, DstEnd - DstBeg) )

            ' Credentials run up to the next "-" separator (user:domain)
            UserBeg = Instr( MyMsg, "domainname") + 10
            UserEnd = Instr( UserBeg, MyMsg, "-")
            UserName = Trim( Mid( MyMsg, UserBeg, UserEnd - UserBeg) )

            ' +11 and +9 skip the label and the opening quote; the value stops
            ' at the first space, which drops the trailing "GMT"
            StartBeg = Instr( MyMsg, "startTime ") + 11
            StartEnd = Instr( StartBeg, MyMsg, " ")
            StartTime = Mid( MyMsg, StartBeg, StartEnd - StartBeg)

            EndBeg = Instr( MyMsg, "endTime ") + 9
            EndEnd = Instr( EndBeg, MyMsg, " ")
            EndTime = Mid( MyMsg, EndBeg, EndEnd - EndBeg)

            DurBeg = Instr( MyMsg, "Duration ") + 9
            DurEnd = Instr( DurBeg, MyMsg, " ")
            Duration = Mid( MyMsg, DurBeg, DurEnd - DurBeg)

            SentBeg = Instr( MyMsg, "Total_bytes_send ") + 17
            SentEnd = Instr( SentBeg, MyMsg, " ")
            SentBytes = Mid( MyMsg, SentBeg, SentEnd - SentBeg)

            RecBeg = Instr( MyMsg, "Total_bytes_recv ") + 17
            RecEnd = Instr( RecBeg, MyMsg, " ")
            RecBytes = Mid( MyMsg, RecBeg, RecEnd - RecBeg)

            ' connectionId is the last field, so take the rest of the message
            ConBeg = Instr( MyMsg, "connectionId") + 12
            ConnectionID = Trim( Mid( MyMsg, ConBeg) )

            ' ICAEND carries no application name
            Application = "NA"
        End If

        ' ICASTART: written when the ICA session opens
        If ( Instr( MyMsg, "ICASTART" ) > 0 ) Then
            SrcBeg = Instr( MyMsg, "Source") + 6
            SrcEnd = Instr( SrcBeg, MyMsg, ":")
            SourceIP = Trim( Mid( MyMsg, SrcBeg, SrcEnd - SrcBeg) )

            DstBeg = Instr( MyMsg, "Destination") + 11
            DstEnd = Instr( DstBeg, MyMsg, ":")
            DestinationIP = Trim( Mid( MyMsg, DstBeg, DstEnd - DstBeg) )

            UserBeg = Instr( MyMsg, "domainname") + 10
            UserEnd = Instr( UserBeg, MyMsg, "-")
            UserName = Trim( Mid( MyMsg, UserBeg, UserEnd - UserBeg) )

            AppBeg = Instr( MyMsg, "applicationName") + 15
            AppEnd = Instr( AppBeg, MyMsg, "-")
            Application = Trim( Mid( MyMsg, AppBeg, AppEnd - AppBeg) )

            StartBeg = Instr( MyMsg, "startTime ") + 11
            StartEnd = Instr( StartBeg, MyMsg, " ")
            StartTime = Mid( MyMsg, StartBeg, StartEnd - StartBeg)

            ConBeg = Instr( MyMsg, "connectionId") + 12
            ConnectionID = Trim( Mid( MyMsg, ConBeg) )

            ' These values only exist once the session has ended
            EndTime = "NA"
            Duration = "NA"
            SentBytes = "NA"
            RecBytes = "NA"
        End If

        ' Hand the parsed values to the custom DB format
        .VarCustom01 = UserName
        .VarCustom02 = Application
        .VarCustom03 = SourceIP
        .VarCustom04 = DestinationIP
        .VarCustom05 = StartTime
        .VarCustom06 = EndTime
        .VarCustom07 = Duration
        .VarCustom08 = SentBytes
        .VarCustom09 = RecBytes
        .VarCustom10 = ConnectionID

    End With

End Function

Next you will create the custom DB format exactly as follows:
(IMPORTANT: NOT SHOWN Make sure you check “MsgDateTime” in this dialog box near the top)

Then you will create a new “Action” called “Log to SQL” and select the Custom DB Format and name the table AGEE_ICA and select “Create Table”. If you have not yet, build your connect string by clicking the box with the three periods at the top “…”

Then watch for ICASTART and ICAEND instances.

Then look at the data in your SQL Server:

Now you can report in real-time on external utilization by the following:

  • Utilization by IP Range
  • Utilization by Domain
  • Utilization by UserID
  • Utilization by time of day
  • Average Session Duration
  • You can tell if someone worked or not (“Yeah, I was on Citrix from 9AM to 5PM”)
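As a rough illustration of the first few reports, here is a small Python sketch that counts completed sessions per user and per hour of day. The rows and the key names are hypothetical stand-ins for whatever your table columns end up being called. The one detail taken from the parsing script is that ICASTART rows carry “NA” in the duration field, so filtering on that avoids counting each session twice.

```python
from collections import Counter

# Hypothetical rows; key names are illustrative, not the real column names.
rows = [
    {"user": "mhayes", "source_ip": "192.168.1.98",
     "start_time": "12/09/2012:14:40:46", "duration": "NA"},        # ICASTART
    {"user": "mhayes", "source_ip": "192.168.1.98",
     "start_time": "12/09/2012:14:40:46", "duration": "00:00:26"},  # ICAEND
    {"user": "jsmith", "source_ip": "10.0.0.7",
     "start_time": "12/09/2012:09:05:10", "duration": "01:12:03"},
    {"user": "mhayes", "source_ip": "192.168.1.98",
     "start_time": "12/09/2012:09:30:00", "duration": "00:45:00"},
]


def completed_sessions(rows):
    """Keep only ICAEND rows so each session is counted once."""
    return [row for row in rows if row["duration"] != "NA"]


def sessions_by(rows, key):
    """Count completed sessions grouped by any field (user, source_ip, ...)."""
    return Counter(row[key] for row in completed_sessions(rows))


def sessions_by_hour(rows):
    """Count completed sessions by the hour in which they started."""
    # start_time looks like 12/09/2012:14:40:46, so the hour is the
    # second colon-separated field.
    return Counter(int(row["start_time"].split(":")[1])
                   for row in completed_sessions(rows))


print(sessions_by(rows, "user"))
print(sessions_by_hour(rows))
```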

Most of the queries you can reverse engineer from Edgesight Under the hood but if there is a specific query you are after just email me.

I get the average session duration with the following query:

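-- Average session length in minutes. Converting the whole HH:MM:SS value to
-- seconds keeps the hours and seconds; datepart(mi, ...) alone would drop them.
select
avg(datediff(second, 0, cast([duration] as datetime))) / 60.0 as avg_minutes
from syslog.dbo.agee_ica
where duration <> 'NA'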

 I tried to put everything in one table as you can see from the SQL Data Columns and the parsing script but you can split it up into separate tables if you want.
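As a cross-check outside the database, the same average can be sketched in a few lines of Python. The function names are mine, and the input is assumed to be the HH:MM:SS strings (or “NA”) that the parsing script writes to the duration column. Converting each value to total seconds keeps the hours and seconds that a minutes-only field would drop.

```python
def duration_seconds(text):
    """Convert an HH:MM:SS string into a total number of seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def average_duration_minutes(durations):
    """Average session length in minutes, skipping the "NA" placeholder."""
    values = [duration_seconds(d) for d in durations if d != "NA"]
    if not values:
        return None  # no completed sessions to average
    return sum(values) / len(values) / 60


print(average_duration_minutes(["00:00:26", "NA", "01:05:00"]))
```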

Thanks for reading!

John

Extending the Rudder

The challenges and benefits of mobile devices in the enterprise.

The last 18 months have witnessed a barrage of smart phones and tablets coming onto the market. While these devices score high marks for being “cool”, I can honestly say the INFOSEC pessimist in me says “Malware Vector” and the enterprise solutions person in me says I can put enterprise applications in the hands of key C-Level decision makers regardless of where they are. If deployed securely, mobile devices and smart phones could be the culmination of the business agility we have all been working toward for years.

 Unfortunately, information security groups are rubbing their temples in the wake of a recent, and rather embarrassing, security breach with Apple’s iPAD product. A few weeks ago I bought a Chinese knock-off android tablet, after receiving it I connected it to my wireless network, brought up my Syslog server to watch PIX logs and within ten minutes, it was phoning home to a site in Japan! (I knew it!) We have also had stories of some of these smart phones being shipped with malware before they are ever handed to the end users.

In many organizations, prior to connecting to your network remotely, you are forced to undergo rigorous endpoint analysis to ensure that you have a proper and updated virus signature, a host based firewall, an approved build, encryption software, etc. Many INFOSEC groups kicked and fought for these policies in what has been described to me by my colleagues as just short of a bloodbath. For some IT shops, the blood of the remote access policy fight hasn’t even dried yet, and if smart phone vendors think that enterprises will abandon these policies to accommodate these devices they are delusional. At the same time, securing a smart phone may strip it down to the point that it is really no more valuable than the cell phone they have today.

I read the other day that Juniper is making a VPN client for Smart Phones. While I agree that Juniper VPN is a good product, I think it is risky to grant a VPN tunnel to any of these appliances. Why would a PC have to pass an endpoint scan and a smart phone not? Are they going to build smart phone endpoint scanners/agents?

 John, if you are so down on smart phones, why do you want to support them?
In our world, the end users do not exist because of us, we exist because of them. This blog post is titled “Extending the Rudder”, and what I mean by that is that key decision makers in a company cannot be given too much agility. I am quite certain that Larry Ellison is not the CEO of Oracle because he is the world’s best DBA. He is in that position because of his ability to steer the company and make critical decisions. Decisions are made through key metrics that are delivered to them via briefings, emails, etc. There are never too many ways to make this information available so long as you can keep it secure.

The mobile platform introduces the ability to take business agility to the next level and effectively “extend the rudder” to C-level and/or key decision makers in any organization. This goes beyond helping them look cool on the golf course. Products like SoftwareFX can deliver business intelligence reporting that is custom fit for a particular smart phone or device. The ability to deliver key metrics or enterprise applications to mobile users will make your organization more nimble AND look cool on the golf course.

Security Breaches:
There was a great article this week from Enterprise Mobile Today on the challenges of supporting mobile devices.  It also included a discussion on the security breach that occurred with Apple’s iPAD stating “Although the Valley Wag, the online publication that broke the story, implied that the breach was Apples responsibility, the issue was due to AT&T’s systems.”

Guess what, if there is a breach of corporate information on an iPAD issued by your company or agency, or you granted access to enterprise applications to a personally owned iPAD, it’s your responsibility. While Apple has restricted the use of middleware on its iPhone/iPAD applications, the other smart phone vendors may not. At issue here is the willingness to open up the OS on these devices to middleware while at the same time protecting the user and themselves from breaches. I know that Apple has taken a lot of flak for its policies on middleware and there is a big push to get them to back off on it. Either way, so long as these moving parts exist, there is a possible vector for malware, breaches and all around jackassery. There have also been concerns about the security of the Safari browser, and opening up your ERP to a mobile device could mean exposing your infrastructure to an OS that currently has no enterprise virus scanning software and, in some cases, has applications installed on it that may carry malware themselves.
 

So how does thin computing get around this?
While I expect a lot of INFOSEC and IT Departments are going to say “No” when it comes to permitting the use of smart phones, I believe that through thin computing via Citrix receiver and XenAPP or XenDesktop you can easily deliver safe and secure enterprise applications that will not run on the smart phone at all but rather on a locked down XenAPP Server or XenDesktop environment that only sends screen refreshes instead of full session traffic that can be sniffed or interpreted by a bot or malware.

Also noted in the article on Enterprise Mobile Today was the fact that several thousand email addresses were stolen as were some of their contact lists, including those of some high level government officials. Citrix has introduced an email client that has been optimized for mobile users. I highly recommend that you look at the session here: http://www.citrix.com/tv/#videos/2385

I think this product is fantastic and shows how organizations are going to have to ready themselves to securely deliver enterprise applications to mobile devices. In this scenario, the user’s email contacts and personally identifiable information exist on the Exchange server and on the XenApp client that is run out of an ICA session. If the phone is lost, stolen, damaged or hacked, the information available on it is of no use as the crown jewels remain safe on the enterprise’s servers. Two factor authentication, which is supported by the Citrix receiver, and a regular password reset regimen will help secure the end user’s credentials. All of these factors will allow systems administrators and INFOSEC types to have the freedom to innovate with this new technology.

The drawing below is an example of a VPN tunnel into an internal Network. In most cases, VPN appliances are installed with an “any any” rule allowing the clients to connect anywhere in the organization once they log in.

 

In this drawing we see how using the Citrix receiver is not a full VPN tunnel but an ICA Session that sends encrypted pixel refreshes to the end user instead of raw data. This means that if there is a Zeus bot, or the like, on the phone looking for key HTML or XML such as “password” or “Card Number”, it will not find it because the only data coming across is screen refreshes. This effectively keeps the data running in a restricted environment via XenAPP or XenDesktop.

 

 

Conclusion:
It appears as though the next technological line in the sand will be these mobile devices. The coming battle for superiority in this space will likely involve small OSes such as the Mac BSD hybrid OS and the Linux hybrid(s) found on a lot of the ‘Droid series phones. These are very streamlined distros on which you will not simply be able to install a complex anti-virus suite like McAfee or Symantec.  Also, I believe that the prevalence of these devices will only grow and they are upon us as Sys Admins whether we like it or not. As Citrix is present in nearly every large company, Citrix receiver, coupled with Access Gateway and SoftwareFX, could put you and your team in a position to be able to accommodate this level of agility. Ensure that your INFOSEC teams understand the difference between an ICA Session and a VPN Tunnel, and begin to educate decision makers on why we can make use of this technology for end users who are in the field and need this level of agility. Put yourself in a position to say yes, as it doesn’t take a great deal of innovation to say “no”.

God knows, I am hardly the gadget enthusiast, in fact I remember telling people that a phone was for talking on and nothing more but this new breed of smart phone and affordable tablets has me excited to see what we can do for our users in the field who, ultimately, pay all of our salaries.

Thanks for reading.

 John

Project Poindexter: (Non-Citrix Related) Grabbing PIX URL logs and checking them for malware.

This is my first non-Citrix related post. I don’t plan on making it a habit, but someone suggested that I post this in case it is valuable to other INFOSEC types.

Let me start off by saying I am not a traditional security guy. I don’t have an abundance of hacking skills, and I am not a black hat, white hat, etc. I did work in Security for about a year as the Event Correlation guy and have been trying to leverage digital epidemiology as a way to secure my systems. As I have stated in previous blogs, we have a better chance of curing the common cold than getting rid of malware and 0-days. In fact, I would say there are two kinds of systems, breached and about to get breached. This is the way you have to approach malware in my opinion. What surprised me with the Aurora breach was that it appears as though the INFOSEC community spends the lion’s share, if not all, of their time on ingress and completely ignores egress. When I look at the Google breach I see an attack that should have been mitigated within 24 hours.

Over the years I have deployed or viewed a number of event correlation utilities, most of them costing in excess of $250K for a large implementation.  What I generally did not like about shrink wrapped solutions, and what I am most concerned about in the IT industry, is the de-emphasis on heuristics and a dependence on an automated process to detect a problem.  In my opinion, an “Event Correlator” is not an appliance, it is an IT Person looking at a series of logs and events and saying “Holy shit! What the HELL is that!”.  The fact is, false positives make a lot of really expensive security software completely useless, and a stored procedure or IDS/IPS cannot do as good a job as a human being who can look at a series of logs and make an interpretation.  What I want to provide here is some of the heavy lifting that can then be used by a human to determine if there is an issue.

The purpose of this post is to show people how I grabbed Syslog data from my PIX, allowing me to capture the URI Stem of all outgoing sessions and log them into a SQL Server. Afterward, I will be able to run key queries to troll for .exe, .dll, .tgz and any other problem extensions. Also, I can upload the latest malware list data and cross reference it with the information in my database, which will allow me to see if any of my systems are phoning home to a botnet master, malware distribution site, etc. This is basically a take on my edgesightunderthehood.com post on monitoring APT with Edgesight.

The first order of business is to get the logs to the syslog server. I start by creating a filter that will grab the logs. (See Below)

The next step is to parse the incoming data into separate columns in my database. This is done by setting up a custom db format for the purpose of these logs. The parse script is provided below:
Also, check all checkboxes below “Read” and “Write”

Parsing Script: (Cut and paste it to a text file then use that text file in the dialog box above)
################################
Function Main()
    Main = "OK"
    Dim MyMsg, Source, Destination, Payload
    Dim SourceBeg, SourceEnd, DSTBeg, DSTEnd

    With Fields
        Source = ""
        Destination = ""
        Payload = ""    ' not populated yet

        MyMsg = .VarCleanMessageText

        ' Only parse PIX messages
        If InStr(MyMsg, "%PIX") > 0 Then
            ' Source: the text between the first ": " and "Accessed"
            SourceBeg = InStr(MyMsg, ": ") + 2
            SourceEnd = InStr(SourceBeg, MyMsg, "Accessed")
            If SourceEnd > SourceBeg Then
                Source = Trim(Mid(MyMsg, SourceBeg, SourceEnd - SourceBeg))
            End If

            ' Destination: the text between "URL" and the next ":"
            DSTBeg = InStr(MyMsg, "URL") + 3
            DSTEnd = InStr(DSTBeg, MyMsg, ":")
            If DSTEnd > DSTBeg Then
                Destination = Trim(Mid(MyMsg, DSTBeg, DSTEnd - DSTBeg))
            End If
        End If

        .VarCustom01 = Source
        .VarCustom02 = Destination
        .VarCustom03 = Payload
    End With
End Function
##################################
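
For anyone who would rather prototype the parsing outside of Kiwi first, here is a minimal sketch of the same idea in Python. It is not the script used above. The sample message is a made-up line in the general shape of a PIX URL log entry, and the function name, the regular expression and the payload capture are assumptions for illustration.

```python
import re

# Hypothetical sample message; the addresses and path are made up.
SAMPLE = "%PIX-5-304001: 192.168.1.50 Accessed URL 123.123.123.123:/botmaster/botnet.exe"

# Source sits between the message ID and "Accessed"; destination sits between
# "URL" and the next ":"; whatever follows that ":" is treated as the payload.
PIX_URL = re.compile(
    r"%PIX-\d-\d+:\s+(?P<source>\S+)\s+Accessed\s+URL\s+"
    r"(?P<destination>[^:\s]+):\s*(?P<payload>\S*)"
)


def parse_pix_url(message):
    """Return source, destination and payload as a dict, or None if no match."""
    match = PIX_URL.search(message)
    if match is None:
        return None
    return match.groupdict()


if __name__ == "__main__":
    print(parse_pix_url(SAMPLE))
```

Returning None for lines that do not match keeps unrelated syslog traffic out of the table instead of writing empty rows.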

The last step is to write the data to SQL but first let’s do a few tasks to prepare the table.

  1. Set up an ODBC connection to a SQL Server and create a database called “Syslog” and connect to it with an account that has dbo privileges.
  2. Create the Custom DB Format for grabbing URL’s

Note that this table will have six columns: msgdatetime, msghostname, msgtext, source, destination and payload. (The last column, payload, is not working yet but I will show you how to get the payload later)

3. Once this is done, create an action called “Write to SQL” and select “PIX_URL” from the custom data format list and name the table “PIX_URL” then select “Create Table”

Okay, so now that we have the data writing to SQL Server, let’s look at a month’s worth of data on one of my systems:

This query will give you the payload and the number of times the payload has been accessed. Using the HAVING clause, I am going to ask for every uri-stem that has been accessed more than 5 times in the last month.

select substring(msgtext, 41, 2048) as "Payload", count(substring(msgtext, 41, 2048))
from pix_url
group by substring(msgtext, 41, 2048)
having count(substring(msgtext, 41, 2048)) > 5
order by count(substring(msgtext, 41, 2048)) desc

The idea behind this is that if you note 1000 records to “123.123.123.123:/botmaster/botnet.exe” you may want to do something about it. You can also download the malwaredomainlist.com data, import it into SQL and cross reference that data to ensure that you are not communicating with any noted malware sites. Depending on the response of this blog, I may post those instructions as well.
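
The cross-reference idea can be sketched in a few lines of Python before committing to a SQL import. This is only an illustration: the function name and the sample data are made up, and it assumes you have already pulled the logged destinations and the known-bad hosts into two plain lists.

```python
from collections import Counter


def flag_known_bad(destinations, bad_hosts):
    """Return {destination: hit count} for destinations found in the bad list."""
    bad = {host.strip().lower() for host in bad_hosts}
    hits = Counter(dest.strip().lower() for dest in destinations)
    return {dest: count for dest, count in hits.items() if dest in bad}


if __name__ == "__main__":
    # Hypothetical data for illustration only.
    logged = ["123.123.123.123", "10.0.0.8", "123.123.123.123", "Example.test"]
    blocklist = ["123.123.123.123", "example.test"]
    print(flag_known_bad(logged, blocklist))
```

Lower-casing both sides keeps a host name from slipping through just because the log and the list disagree on capitalization.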

 And here is what the results look like:

Another query I like to run is one looking for executable files in the URI-stem.

select Msghostname as "Firewall", Source, Destination, substring(msgtext, 41, 2048) as "Payload"
from pix_url
where msgtext like '%.exe%'
order by msgdatetime desc

This will allow me to troll for executables that my internal users are accessing, as with most versions of malware, this should show itself early on during the breach.
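
The same check can be sketched in Python for a list of URI stems. This is an illustration under stated assumptions, not part of the setup above: the extension list comes from the extensions mentioned earlier in this post, and the function name is made up.

```python
import posixpath
from urllib.parse import urlsplit

# Extensions named earlier in this post; extend the set to suit your environment.
SUSPECT_EXTENSIONS = {".exe", ".dll", ".tgz"}


def is_suspect(uri_stem):
    """True when the path part of the URI stem ends in a suspect extension."""
    path = urlsplit(uri_stem).path
    extension = posixpath.splitext(path)[1].lower()
    return extension in SUSPECT_EXTENSIONS


if __name__ == "__main__":
    for stem in ["/test.exe", "/downloads/setup.EXE?id=7", "/index.html"]:
        print(stem, is_suspect(stem))
```

Unlike the LIKE '%.exe%' filter, this only looks at the extension of the path itself, so a query string that merely mentions ".exe" is not flagged. Whether that is stricter or looser than you want is a judgment call.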

So how do you monitor?

Well, you don’t have to sit there with query analyzer open all day but you can set up SQL Server Reporting Services to do this chore for you and deliver a dashboard to operations personnel. Here is a quick view of a dashboard that refreshes every 5 seconds and turns RED when “.exe” is in the URI-Stem. In this scenario, you would be able to investigate the executable that is being downloaded by the client and ensure that it is not malware. You can test this yourself once you set it up by going to any site and typing “/test.exe” at the end.


Conclusion:
Again, I am not a traditional security guy so this could be utterly useless, I am not the PIX guy at my job, I AM the PIX guy at home though. Also, I have found it very useful to check for Malware and 0-Day’s that my anti-virus does not pick up. While I cannot speak with as much authority as a number of CISSP’s and INFOSEC guru’s, I can say that the continued ignorance surrounding egress will allow malware to run amuck. As I stated in a previous blog, it is foolish to beat your chest at the millions of packets you keep out while the few that get in can take anything they want, and leave unmolested. Just like a store has to let some people in then focus on ensuring no one leaves with anything they didn’t pay for, IT Security needs to ease over to this mentality and keep track of what is leaving its networks and where it is being sent. At any rate, if this has value to anyone let me know, I will include the RDL (Report File) online for download if anyone wants to set it up. I know a lot of PIX guys aren’t necessarily web/database guys so if you have any questions, feel free to ask.

Thanks for reading,

John

Project Poindexter: Endpoint Analysis Log Harvesting

About four years ago management wanted to know which users were failing their endpoint analysis scans and to what extent we were compliant with endpoint analysis. We spent over $30K on a product called “Clear2View” and it did some rudimentary scan logging for us, but the data was not very easy to query even though it was located in a SQL Database, and the reporting features were, in my opinion, only so-so. With that, it appears as though Clear2View has gone away and many of us are left wondering how we will get our EPA Scan data on the new AGEE platform. We have been able to get past this dilemma by harvesting the Syslog Data from the AGEE and parsing it into a SQL Server and then integrating it with Business Intelligence.

As with other “Project Poindexter” posts, we will cover how to grab EPA Scan results from SYSLOG and write them to a SQL Server then report on them at a cost considerably less than $30K.

Materials:
Kiwi Syslog Server (Full version is $260 bucks)
SQL Server w/Reporting Services (You should already have if you have Edgesight)

Skills:
Some vbscript or parsing skills, although I will provide the parsing script to you.
The ability to take my SQL Syntax and edit it so that it suits your scans/environment.
The ability to upload an RDL to Reporting Services and map it to a data source.

So getting started, here is an Example:
So, at home with the VPX and some test vm’s I set up the following scans:

As you can see, I am testing for the McAfee suite (a canned scan) and to see if the Windows Firewall is running.

Results: Here are the results that come into KIWI.

06-26-2010    12:16:05    Local7.Error    192.168.1.75    06/26/2010:11:41:06 GMT ns PPE-0 : SSLVPN CLISEC_EXP_EVAL 104254 : User wireless: – Client IP 192.168.1.50 – Vserver 192.168.1.100:443 – Client security expression CLIENT.SVC(MpsSvc) EXISTS evaluated to FALSE(3)

06-26-2010    12:16:05    Local7.Error    192.168.1.75    06/26/2010:11:41:06 GMT ns PPE-0 : SSLVPN CLISEC_EXP_EVAL 104253 : User wireless: – Client IP 192.168.1.50 – Vserver 192.168.1.100:443 – Client security expression CLIENT.SVC(MCVSRte).VERSION == 9.0.0 -frequency 5 evaluated to FALSE(3)

06-26-2010    12:16:05    Local7.Error    192.168.1.75    06/26/2010:11:41:06 GMT ns PPE-0 : SSLVPN CLISEC_EXP_EVAL 104252 : User wireless: – Client IP 192.168.1.50 – Vserver 192.168.1.100:443 – Client security expression CLIENT.APPLICATION.AV(McafeeVirusScanEnterprise).VERSION == 7.0 -frequency 5 evaluated to FALSE(3)

06-26-2010    12:16:05    Local7.Error    192.168.1.75    06/26/2010:11:41:06 GMT ns PPE-0 : SSLVPN CLISEC_EXP_EVAL 104251 : User wireless: – Client IP 192.168.1.50 – Vserver 192.168.1.100:443 – Client security expression CLIENT.APPLICATION.AV(McafeeVirusScan).VERSION == 7.0 -frequency 5 evaluated to FALSE(3)

06-26-2010    12:16:05    Local7.Error    192.168.1.75    06/26/2010:11:41:06 GMT ns PPE-0 : SSLVPN CLISEC_EXP_EVAL 104250 : User wireless: – Client IP 192.168.1.50 – Vserver 192.168.1.100:443 – Client security expression CLIENT.APPLICATION.AV(McafeeNetshield).VERSION == 7.0 -frequency 5 evaluated to FALSE(3)
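
To make the structure of these lines concrete, here is a minimal Python sketch that pulls the user, client IP, vserver, scan expression and result out of one CLISEC_EXP_EVAL message. It is not the Kiwi parsing script. The regular expression, the function name and the "vserver" field name are assumptions for illustration. The pattern accepts either a plain hyphen or the dash shown above as the separator.

```python
import re

# One of the messages shown above, starting at the SSLVPN tag.
SAMPLE = (
    "SSLVPN CLISEC_EXP_EVAL 104254 : User wireless: – Client IP 192.168.1.50 "
    "– Vserver 192.168.1.100:443 – Client security expression "
    "CLIENT.SVC(MpsSvc) EXISTS evaluated to FALSE(3)"
)

CLISEC = re.compile(
    r"CLISEC_EXP_EVAL\s+\d+\s*:\s*User\s+(?P<userid>\S+?)\s*:\s*[-–]\s*"
    r"Client IP\s+(?P<clientip>\S+)\s*[-–]\s*"
    r"Vserver\s+(?P<vserver>\S+)\s*[-–]\s*"
    r"Client security expression\s+(?P<scan>.+?)\s+evaluated to\s+(?P<result>\w+)"
)


def parse_epa_scan(message):
    """Return the parsed fields as a dict, or None if the line does not match."""
    match = CLISEC.search(message)
    if match is None:
        return None
    return match.groupdict()


if __name__ == "__main__":
    print(parse_epa_scan(SAMPLE))
```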

So next let’s take these results and get them parsed then logged to SQL Server:

Create a new Rule called “EPA Scans” and create one filter with three actions.
The First Filter is called “Filter Text – CLISEC” and set it up to filter message text for “CLISEC”
The first Action is “DISPLAY”
The second Action is “Parse Data” (Note: check all the boxes for Read and Write and browse to the location of the Parsing Script, which you can get at http://www.ctxsupport.com in the “ACCESS GATEWAY” forum)

The third Action is called “Write to SQL” which will require a custom data format so let’s cover those steps:

Custom Data Format:
Create a custom DB Format called EPA_SCANS, it should appear as follows: (Note the Field names AND the data types as they are very important)

Now that you have created your custom DB format go back to your “Write to SQL” action

Make sure that your DSN connect string is correct, that you name the table EPA_SCANS under database table name, and that you use the Custom DB Format EPA_Scans, then click on “Create Table”

Once this is done you should be all set, log into your VPN/AGEE Address and look for the results by running a simple SQL Query:

select * from epa_scans
order by msgdatetime desc

You should see something like the following:

Note that in the results I include 7 columns. I always include the entire log in the msgtext column for several reasons, among them that security statutes may dictate that you must have all of the log available, and there are instances where parsed logs are not admissible in court. For this endeavor, it is your choice, I have a habit of just leaving it in.

Also, my goal of setting up the logging was so that the Service Desk staff could look at the results and tell the end users what the problem is. To deal with that issue let’s take a look at the actual scans:

CLIENT.APPLICATION.AV(McafeeNetshield).VERSION == 7.0 -frequency 5
CLIENT.APPLICATION.AV(McafeeVirusScan).VERSION == 7.0 -frequency 5
CLIENT.APPLICATION.AV(McafeeVirusScanEnterprise).VERSION == 7.0 -frequency 5
CLIENT.SVC(MCVSRte).VERSION == 9.0.0 -frequency 5
CLIENT.SVC(MpsSvc) EXISTS

As you can see from the scans above, a Level I engineer may not have a very easy time with this, so we are going to change our SQL up a little bit to give each scan a friendlier description. That way, when someone calls the helpdesk saying they cannot get to a resource due to a failed scan, the person on the phone with them can give them a clear explanation of what the issue is.

So let’s shake up our SQL just a little:

select msgdatetime, userid, clientip, scan =
    case Scan
    when 'CLIENT.SVC(MCVSRte).VERSION == 9.0.0 -frequency 5' then 'Antivirus Service Check'
    when 'CLIENT.APPLICATION.AV(McafeeVirusScanEnterprise).VERSION == 7.0 -frequency 5' then 'Antivirus ENT.Version Check'
    when 'CLIENT.APPLICATION.AV(McafeeVirusScan).VERSION == 7.0 -frequency 5' then 'Antivirus Std. Version Check'
    when 'CLIENT.APPLICATION.AV(McafeeNetshield).VERSION == 7.0 -frequency 5' then 'Netshield Version 7 Check'
    when 'CLIENT.SVC(MpsSvc) EXISTS' then 'Check Microsoft Firewall Service'
    end,
    Result
from epa_scans
order by msgdatetime desc

WordPress has a habit of converting straight quotes into curly quotes, so if this does not paste cleanly into your query, retype the quotes by hand. I will also include this in the Access Gateway area of http://ctxsupport.com. At any rate note the following:
We are taking the cryptic “CLIENT.APPLICATION.AV(McafeeVirusScanEnterprise).VERSION == 7.0 -frequency 5” text and converting it into the more easily interpreted “Antivirus ENT.Version Check”.

Your SQL Query, and eventually your SQL Reporting Services reports, will appear as follows:

Also, your SQL Report will appear as follows:

Note that the failures are RED, which will alert your staff, and also note how much more logical and more interpretable the SCAN information is. You could also rig up a self service by providing a link on the scan sending the user to the place to either inoculate their system or instructions on how to turn on their Microsoft Firewall.
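
The friendly-name mapping used in the CASE expression can also be sketched as a simple lookup table, which may be handy if you post-process the results outside of SQL. The dictionary below reuses the scan expressions and descriptions from this post. The function name and the fallback behaviour are assumptions.

```python
# Scan expression -> description, taken from the CASE expression in this post.
FRIENDLY_NAMES = {
    "CLIENT.SVC(MCVSRte).VERSION == 9.0.0 -frequency 5": "Antivirus Service Check",
    "CLIENT.APPLICATION.AV(McafeeVirusScanEnterprise).VERSION == 7.0 -frequency 5":
        "Antivirus ENT.Version Check",
    "CLIENT.APPLICATION.AV(McafeeVirusScan).VERSION == 7.0 -frequency 5":
        "Antivirus Std. Version Check",
    "CLIENT.APPLICATION.AV(McafeeNetshield).VERSION == 7.0 -frequency 5":
        "Netshield Version 7 Check",
    "CLIENT.SVC(MpsSvc) EXISTS": "Check Microsoft Firewall Service",
}


def describe_scan(expression):
    """Return the friendly description, or the raw expression if it is unknown."""
    return FRIENDLY_NAMES.get(expression.strip(), expression)


if __name__ == "__main__":
    print(describe_scan("CLIENT.SVC(MpsSvc) EXISTS"))
```

One difference from the SQL: a CASE with no ELSE returns NULL for a scan it does not recognize, while this sketch falls back to the raw expression so that a new scan still shows up on the report.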

Again all parsing scripts, RDL’s and SQL Queries are located here

Why is this even important:
Well, as the security screw gets tighter and tighter more and more restrictions are going to be placed on both internal and remote access systems. It will be a disaster to deploy endpoint analysis on a large scale without being able to at least give the support staff the ability to tell the users why they did not get access to a resource. We plan on taking this to the next level and providing an HTML Injection rule so that when a user goes straight to Web Interface because they failed a scan, there is a popup button that tells them they failed with a URL to the report above letting them know what scan failed, and eventually, a hyperlink to take them to a remediation page (Be it instructions or updated signatures).

Also, I believe, there never was a Clear2View for the AGEE anyway so those of us with the AGEE version were kind of left out of that game. This process sets you up with all the business intelligence you need to support NAC-like endpoint analysis and also allows you to report on the level of compliance for your company or agency. Oh…and it only costs $260 bucks plus some time (which I understand is expensive)

IMPORTANT NOTE/DISCLAIMER:
Obviously, Citrix will not support this. Also, you WILL HAVE to be able to edit the SQL Statement both within the Query Analyzer AND the RDL file, otherwise your report will not show proper data. You do need to have some SQL proficiency to pull this off but you do not have to be a full-fledged DBA. If you are a partner, this could be a very nice value-add for a customer if you have a few hours left in an engagement. It was not excessively difficult to do.

Also, I don’t run all of the scans that everyone else may or may not run. There may be an instance where a particular scan does not parse properly, if so, shoot me an email and I will see if I can’t figure it out.

As with the VPN Logging, I plan on producing a video walkthrough of this entire task. I should have some heads-down time at the beginning of next month to walk through it.

This literally took 45 minutes to set up once I had the Parsing scripts and my SQL Figured out. If you run into a problem, feel free to shoot me an email.

Thanks for reading

John

Edgesight Under the Hood: Part 2 (Will be moved to Edgesightunderthehood.com)

Okay, so in this blog posting I want to continue covering a few more views in Edgesight that I like to run ad hoc queries against.  Today’s view is called vw_es_archive_application_network_performance.  This view provides information on network delay, server delay, XenApp server, process name and the downstream hosts that your XenApp servers communicate with.  I have used this view to check delays of executables such as winlogon.exe, to check the delay between this process and our domain controllers.  I will cover checking delays by process name, XenApp server and downstream host.

 The first part will be to demonstrate how to find Network and Server delay of specific downstream hosts as well as how to measure the average XenAPP Servers delay.  Then in the second part I want to answer one of the questions from the first posting.  

 Down Stream Delay:
I actually got to present on Edgesight during Synergy 2008 and one of the key points that I tried to drive home is how Edgesight helps you with the never ending B.S. Witch hunts that always seem to occur when someone’s application is “running slow on Citrix”.  I would say that less than 30 % of what I actually investigate ends up being an actual XenAPP issue.  I will go over a few ad hoc queries that will give you the average delay of your down stream hosts and will give you the average delay experienced by each XenAPP Server allowing you to see if you have a specific XenAPP box that may be having some issues.   

The first ad hoc query has to do with downstream hosts. It will return the downstream host and the Network/Server delay.  I have set this query to filter out any downstream host that does not have at least 100 records and a server delay of at least 300 milliseconds.  You can edit/remove the “Having” clause to suit your environment.

select distinct hostname, sum(network_delay_sum)/sum(record_count) as "Network Delay", sum(server_delay_sum)/sum(record_count) as "Server Delay"
from vw_es_archive_application_network_performance
group by hostname
having sum(record_count) > 100
and sum(server_delay_sum)/sum(record_count) > 300
order by sum(server_delay_sum)/sum(record_count) desc

 

In English: “Give me the Network and Server delay of every downstream host that has at least 100 records (packets?) and a server latency of at least 300ms” 
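
The arithmetic behind that query is a record-weighted average: the delay sums are added up per host and divided by the total record count, and only then are the thresholds applied. Here is a minimal Python sketch of the same calculation. The function name, the tuple layout and the sample rows are assumptions for illustration.

```python
from collections import defaultdict


def average_delay_by_host(rows, min_records=100, min_server_delay=300):
    """rows: (hostname, network_delay_sum, server_delay_sum, record_count).

    Returns (hostname, network delay, server delay), slowest server delay first.
    """
    totals = defaultdict(lambda: [0, 0, 0])
    for hostname, network_delay_sum, server_delay_sum, record_count in rows:
        entry = totals[hostname]
        entry[0] += network_delay_sum
        entry[1] += server_delay_sum
        entry[2] += record_count

    report = []
    for hostname, (net_sum, srv_sum, count) in totals.items():
        if count <= min_records:  # mirrors: having sum(record_count) > 100
            continue
        network_delay = net_sum / count
        server_delay = srv_sum / count
        if server_delay > min_server_delay:
            report.append((hostname, network_delay, server_delay))

    report.sort(key=lambda item: item[2], reverse=True)
    return report


if __name__ == "__main__":
    # Hypothetical rows for illustration only.
    sample = [
        ("dc01", 1000, 40000, 60),
        ("dc01", 2000, 50000, 60),
        ("sql01", 500, 90000, 150),
        ("web01", 900, 3000, 200),
    ]
    for line in average_delay_by_host(sample):
        print(line)
```

Python's division here keeps the fractional part. If the columns in the view are integers, SQL Server's division will truncate, so the numbers may differ slightly.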

 XenAPP Server Delay: 
It is a good idea to monitor your XenAPP Server delay, this will tell you if there is a particular XenAPP Server that is having a layer 1 or layer 2 issue.  This is a quick query that will show you the average delay of your XenAPP Servers.   

select distinct machine_name, sum(network_delay_sum)/sum(record_count) as "Network Delay", sum(server_delay_sum)/sum(record_count) as "Server Delay"
from vw_es_archive_application_network_performance
group by machine_name
order by sum(server_delay_sum)/sum(record_count) desc

 

Note: You will also see “Edgesight for Endpoints” client data in this table as well.  

 

Executable  Delay:
This query shows the delay associated with individual executables.  You may check outlook.exe to see if you have a delay in a downstream Exchange server or, in my case, check winlogon.exe for delays to domain controllers.

select distinct exe_name, sum(network_delay_sum)/sum(record_count) as "Network Delay", sum(server_delay_sum)/sum(record_count) as "Server Delay"
from vw_es_archive_application_network_performance
group by exe_name
order by sum(server_delay_sum)/sum(record_count) desc

Session Statistics:
Last week I got a question about session counts and I wanted to answer it in this post. Here was the question:

 ”I’m looking for a custom report showing the application usage (Published Apps, not processes) on a hourly, daily and monthly base and a custom report showing the concurrent sessions on a hourly, daily and monthly base.”  

The view I used for this was vw_ctrx_archive_client_start_perf.

-- varchar with no length defaults to a single character in a declare, so lengths are given explicitly
declare @begin varchar(2)
declare @end varchar(2)
declare @today datetime
declare @app varchar(128)
set @today = convert(varchar,getdate(),111)
set @begin = '00'
set @end = '23'
set @app = '%Outlook%'
select convert(varchar(2),dateadd(hh,-4,time_stamp), 108)+':00' as "Time", count(distinct sessid)
from vw_ctrx_archive_client_start_perf
where convert(varchar(10),dateadd(hh,-4,time_stamp), 111) = @today-1
and published_application like '%'+@app+'%'
group by convert(varchar(2),dateadd(hh,-4,time_stamp), 108)+':00'
order by convert(varchar(2),dateadd(hh,-4,time_stamp), 108)+':00'

 In English: Give me the number of unique sessions for each hour of yesterday for a specific application.  In this report, replace Outlook in @app with whichever app you want to see.  Note that this is an hourly report so the time format is set to 108.
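
The bucketing in that query boils down to three steps: shift each timestamp by the offset, keep only the rows for the day you care about, and count distinct session IDs per hour label. Here is a minimal Python sketch of those steps. The function name, the tuple layout and the sample rows are assumptions, and the -4 hour default simply mirrors the dateadd(hh,-4,...) in the query.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def sessions_per_hour(rows, day, offset_hours=-4):
    """rows: (time_stamp, sessid) pairs; day: the shifted calendar day to report on.

    Returns {"HH:00": number of distinct session IDs}, ordered by hour.
    """
    sessions = defaultdict(set)
    for time_stamp, sessid in rows:
        shifted = time_stamp + timedelta(hours=offset_hours)
        if shifted.date() == day:
            sessions[shifted.strftime("%H:00")].add(sessid)
    return {hour: len(ids) for hour, ids in sorted(sessions.items())}


if __name__ == "__main__":
    # Hypothetical rows for illustration only.
    sample = [
        (datetime(2010, 7, 2, 13, 5), 1),
        (datetime(2010, 7, 2, 13, 40), 1),
        (datetime(2010, 7, 2, 14, 10), 3),
    ]
    print(sessions_per_hour(sample, datetime(2010, 7, 2).date()))
```

Collecting session IDs in a set is what gives the same effect as count(distinct sessid): a session that shows up twice in the same hour is only counted once.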

 Daily Application Usage:
In the same view I change the query above just a little to accommodate a query by day.

declare @begin varchar(2)
declare @end varchar(2)
declare @today datetime
declare @app varchar(128)
set @today = convert(varchar,getdate(),111)
set @app = '%Outlook%'
select convert(varchar(10),dateadd(hh,-4,time_stamp), 111) as "Date", count(distinct sessid)
from vw_ctrx_archive_client_start_perf
where convert(varchar(10),dateadd(hh,-4,time_stamp), 111) > @today-30
and published_application like '%'+@app+'%'
group by convert(varchar(10),dateadd(hh,-4,time_stamp), 111)
order by convert(varchar(10),dateadd(hh,-4,time_stamp), 111)

 Monthly Application Usage:
Depending on how long you have your retention set (min is 30 days) this query may or may not work for you but this is the number of unique sessions per application for a month.

declare @begin varchar(2)
declare @end varchar(2)
declare @today datetime
declare @app varchar(128)
set @today = convert(varchar,getdate(),111)
set @app = '%Outlook%'
select convert(varchar(7),dateadd(hh,-4,time_stamp), 111) as "Date", count(distinct sessid)
from vw_ctrx_archive_client_start_perf
where convert(varchar(10),dateadd(hh,-4,time_stamp), 111) > @today-30
and published_application like '%'+@app+'%'
group by convert(varchar(7),dateadd(hh,-4,time_stamp), 111)
order by convert(varchar(7),dateadd(hh,-4,time_stamp), 111)

Application Matrix:
SQL Server Reporting Services will let you create a matrix, these two queries are for daily and monthly which will let you sort as follows:

                     Date1    Date2    Date3    Date4    Date5
Outlook              Count1   Count2   Count3   Count4   Count5
Word                 Count1   Count2   Count3   Count4   Count5
Oracle Financials    Count1   Count2   Count3   Count4   Count5
Statistical APP      Count1   Count2   Count3   Count4   Count5
Custom APP-A         Count1   Count2   Count3   Count4   Count5

 

  This has been the report method that has made my management the happiest so I use the Matrix tool with SSRS as often as possible.  Remember, if you have Edgesight, you have SSRS and setting up reports is no harder than an Access Database.
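
If you want to sanity check the shape of the matrix before building it in SSRS, the pivot itself is easy to sketch in Python. This is only an illustration of the layout. The function name and sample rows are made up, and it assumes each row is a (date, application, session count) triple like the matrix queries in this post return.

```python
def build_matrix(rows):
    """rows: (date, application, count) triples.

    Returns a list of lists: a header row, then one row per application with
    one count per date (0 where an application had no sessions that day).
    """
    dates = sorted({date for date, _, _ in rows})
    apps = sorted({app for _, app, _ in rows})
    counts = {(app, date): count for date, app, count in rows}

    header = ["Application"] + dates
    body = [[app] + [counts.get((app, date), 0) for date in dates] for app in apps]
    return [header] + body


if __name__ == "__main__":
    # Hypothetical rows for illustration only.
    sample = [
        ("2010/07/01", "Outlook", 40),
        ("2010/07/01", "Word", 12),
        ("2010/07/02", "Outlook", 38),
    ]
    for line in build_matrix(sample):
        print(line)
```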

Here are the queries

 

First The Daily Matrix:

declare @begin varchar(2)
declare @end varchar(2)
declare @today datetime
declare @app varchar(128)
set @today = convert(varchar,getdate(),111)
select convert(varchar(10),dateadd(hh,-4,time_stamp), 111) as "Date", published_application, count(distinct sessid)
from vw_ctrx_archive_client_start_perf
where convert(varchar(10),dateadd(hh,-4,time_stamp), 111) > @today-30
group by convert(varchar(10),dateadd(hh,-4,time_stamp), 111), published_application
order by convert(varchar(10),dateadd(hh,-4,time_stamp), 111), count(distinct sessid) desc

Then the Monthly Matrix:
declare @today datetime
set @today = convert(varchar,getdate(),111)
select convert(varchar(7),dateadd(hh,-4,time_stamp), 111) as "Date", published_application, count(distinct sessid)
from vw_ctrx_archive_client_start_perf
where convert(varchar(10),dateadd(hh,-4,time_stamp), 111) > @today-30
group by convert(varchar(7),dateadd(hh,-4,time_stamp), 111), published_application
order by convert(varchar(7),dateadd(hh,-4,time_stamp), 111), count(distinct sessid) desc

 Concurrent Session Statistics:
A colleague of mine, Alain Assaf, set up a system that gives you this info every five minutes and is almost in real time, go to wagthereal.wordpress.com to see it.  Keep in mind that Edgesight is not real time data so if you set up a private dashboard for it, you may have to wait for it to refresh. 

The vw_ctrx_archive_client_start_perf view appears to give us only start times of specific published applications. Perhaps the most used view in my reports is vw_ctrx_archive_ica_roundtrip_perf. For this set of queries, I will count concurrent sessions, but I will also go into ICA delays for clients in my last post on Edgesight Under the Hood.

I will try to answer the user’s question on concurrent sessions with three pretty basic queries for hourly, daily and monthly usage:

Hourly Users:
-- Distinct users per hour for a single day.
-- @today-3 is three days ago; change the offset to pick another day.
declare @today datetime
set @today = convert(varchar(10),getdate(),111)  -- today at midnight
select convert(varchar(2),dateadd(hh,-4,time_stamp), 108)+':00' as [Time], count(distinct [user]) as [Users]
from vw_ctrx_archive_ica_roundtrip_perf
where convert(varchar(10),dateadd(hh,-4,time_stamp), 111) = @today-3
group by convert(varchar(2),dateadd(hh,-4,time_stamp), 108)+':00'
order by convert(varchar(2),dateadd(hh,-4,time_stamp), 108)+':00'

 

Daily Users:
-- Distinct users per day for the last 30 days.
declare @today datetime
set @today = convert(varchar(10),getdate(),111)  -- today at midnight
select convert(varchar(10),dateadd(hh,-4,time_stamp), 111) as [Date], count(distinct [user]) as [Users]
from vw_ctrx_archive_ica_roundtrip_perf
where convert(varchar(10),dateadd(hh,-4,time_stamp), 111) > @today-30
group by convert(varchar(10),dateadd(hh,-4,time_stamp), 111)
order by convert(varchar(10),dateadd(hh,-4,time_stamp), 111)

Monthly Users:

-- Distinct users per month (yyyy/mm).
-- The where clause still limits the data to the last 30 days; widen it to report on more months.
declare @today datetime
set @today = convert(varchar(10),getdate(),111)  -- today at midnight
select convert(varchar(7),dateadd(hh,-4,time_stamp), 111) as [Date], count(distinct [user]) as [Users]
from vw_ctrx_archive_ica_roundtrip_perf
where convert(varchar(10),dateadd(hh,-4,time_stamp), 111) > @today-30
group by convert(varchar(7),dateadd(hh,-4,time_stamp), 111)
order by convert(varchar(7),dateadd(hh,-4,time_stamp), 111)
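
If the convert(varchar(n), ..., 111) trick is not obvious, here is a minimal Python sketch of the bucketing the three user queries rely on: shift the time stamp by the offset, truncate it to an hour, day or month label, and count distinct users per label. The function name distinct_users and the sample rows are made up for illustration, and unlike the hourly query above the sketch does not restrict itself to a single day.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Label formats mirroring the truncated convert() output in the queries:
# varchar(2) of style 108 + ':00', varchar(10) of style 111, varchar(7) of style 111.
FORMATS = {"hour": "%H:00", "day": "%Y/%m/%d", "month": "%Y/%m"}


def distinct_users(rows, bucket, offset_hours=-4):
    """Count distinct users per bucket for (time_stamp, user) rows."""
    seen = defaultdict(set)
    for stamp, user in rows:
        shifted = stamp + timedelta(hours=offset_hours)  # same as dateadd(hh,-4,...)
        seen[shifted.strftime(FORMATS[bucket])].add(user)
    return {label: len(users) for label, users in sorted(seen.items())}


# Hypothetical sample rows, not real Edgesight data.
rows = [
    (datetime(2009, 11, 2, 3, 30), "jsmith"),
    (datetime(2009, 11, 2, 14, 0), "jsmith"),
    (datetime(2009, 11, 2, 14, 5), "adoe"),
]
print(distinct_users(rows, "day"))
print(distinct_users(rows, "month"))
```

Note that the 03:30 row lands on the previous day once the offset is applied, which is exactly why the offset has to match your time zone.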

Conclusion:
For the most part, I have vetted all of these queries. You may get varying results; if so, check for payload errors, licensing, etc. I would really like to see some better documentation on the data model. Most of these were basically done by running the query and checking it against the EdgeSight canned reports to see if my SWAG about how they did their calculations was correct. All of the queries I ran here looked accurate when I checked them. If you are going to bet the farm on any of these queries in front of the brass in your organization, vet my numbers.

My next post will deal with ICA latency and delay issues for individual users and servers.

Thanks for reading!

John

       

Xen and the art of Digital Epidemiology

In 2003 I started steering my career toward Citrix/VMWare/Virtualization. At the time, I was being laughed at for running this fledgling product called ESX Server 1.51, and most of my environment was Windows based. There were plenty of shrink-wrapped tools to let me consolidate my events and the only Unix I had to worry about was the Linux Kernel on the ESX Server. Since then, my environment has come under a series of new regulatory frameworks (Sarbanes, CISP, and currently FIPS 140-2). What used to be a Secure Gateway with a single web interface server and my back end XenAPP farm now includes a Gartner leading VPN Appliance, Access Gateway Enterprise Edition, load balanced (GSLB) web interface servers, an application firewall and XenApp servers hosted on Linux based XenServer and VMWare. So now, when I hear “A user called and said their XenAPP session was laggy,” where the hell do I begin? How do I get a holistic vision of all of the security, performance and stability issues that could come up in this new environment?

As a security engineer in 2004, I started calling event correlation digital epidemiology. Epidemiology is defined as “the branch of medicine dealing with the incidence and prevalence of disease in large populations and with detection of the source and cause of epidemics of infectious disease.”

I think that this same principle can be applied to system errors, computer based viruses and overall trends. At the root of this is the ability to collate logs from heterogeneous sources into one centralized database. During this series, I hope to go over how to do this without going to your boss and asking for half a million dollars for an event correlation package.

I currently perform the following with a $245 copy of KIWI Syslog Server (integrated with SQL Server Reporting Services):

  • Log all Application Firewall alerts to a SQL Server and present them via an operations dashboard. This includes violation (SQL Injection, XSS, etc), offending IP and time of day.
  • Pull STA Logs and provide a dashboard matrix with the number of users, total number of helpdesk calls, percentage of calls (over 2.5% means we have a problem) and the last ten calls. (Our operations staff can see that “PROTOCOL DRIVER ERROR” and react before we start getting calls.)
  • I am alerted when key VIP Personnel are having trouble with their SecurID or AD Credentials.
  • I can track the prevalence of any error: I can tell when it started and how often it occurs.
  • My service desk has a tracker application that they can consult when a user cannot connect, telling them if their account is locked out, their key fob is expired or if they just fat fingered their password. This has turned a 20 minute call into a 3 minute call.
  • I have a dashboard that tells me the “QFARM /Load” data for every server, refreshing every 5 minutes. It turns yellow at 7500 and red at 8500, letting us know when a server may be about to waffle.
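
The two dashboard thresholds in that list boil down to very little logic. Here is a minimal Python sketch of it, not the SSRS expressions behind the real dashboard. The function names and the "green" state are hypothetical, and it assumes that “at 7500” means 7500 or higher and that the call percentage is helpdesk calls divided by users.

```python
def load_status(load, yellow=7500, red=8500):
    """Map a QFARM /LOAD value to a dashboard colour."""
    if load >= red:
        return "red"
    if load >= yellow:
        return "yellow"
    return "green"  # assumed name for the normal state


def call_rate_is_problem(helpdesk_calls, users, threshold_pct=2.5):
    """True when helpdesk calls exceed threshold_pct percent of users."""
    if users == 0:
        return False  # assumption: no users means nothing to alert on
    return 100.0 * helpdesk_calls / users > threshold_pct


print(load_status(7600))
print(call_rate_is_problem(3, 100))
```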

For this part of the Digital Epidemiology series I will go over parsing and logging STA Logs, why it was important to me and what you can do with them after getting them into a SQL Server.

Abstract:

A few years ago, I was asked, “What is the current number of external vs internal users?” Answering this involved a very long, complicated query against RMSummaryDatabase that worked okay but was time consuming. One thing we did realize was that every user who accessed our platform externally came through our CAG/AGEE, which meant that they were issued a ticket by the STA Servers. So we configured logging on the STA Servers and realized a few more things: we also got the application that they launched as well as the IP Address of the server they logged into. So now, if a user says they had a bad Citrix experience, we know where they logged in and what applications they used. While Edgesight does most of our user experience troubleshooting for us, it does not upload in real time and our STA solution does. We know right then and there.

By integrating this with SQL Server Reporting Services, we have a poor man’s Thomas Koetzing solution where we can search the utilization of certain applications, users and servers.

For this post we will learn how to set up STA Logging and how to use EPILOG from Intersect Alliance to write the data to a KIWI Syslog Server. Then we will learn how to parse that data and write it to a SQL Server, and use some of the queries I have included to gain valuable data that can eventually be used in a SQL Server Reporting Services report.

Setting up STA Logging:

Go to %systemroot%\program files\Citrix\system32 and add the following to the ctxsta.config file:

LogLevel=3
MaxLogCount=10
MaxLogSize=55 (Make sure this size is sufficient).

LogDir=W:\Program Files\Citrix\logs\

In the LogDir folder you will note that the log files created will be named sta2009MMDD.log

What exactly is in the logs:
The logs will show up in the following format. (We are interested in the ServerAddress, UserName and ApplicationName values, which a parse script will pipe into a database for us.)

INFORMATION 2009/11/22:22:29:32 CSG1305 Request Ticket - Successful. ED0C6898ECA0064389FDD6ABE49A03B9 V4 CGPAddress = 192.168.1.47:2598:localhost:1494 Refreshable = false XData = <?xml version="1.0"?><!--DOCTYPE CtxConnInfoProtocol SYSTEM "CtxConnInfo.dtd"--><CtxConnInfo version="1.0"><ServerAddress>192.168.1.47:1494</ServerAddress><UserName>JSMITH</UserName><UserDomain>cdc</UserDomain><ApplicationName>Outlook 2007</ApplicationName><Protocol>ICA</Protocol></CtxConnInfo> ICAAddress = 192.168.1.47:1494

Okay, so I have logs in a flat file….big deal!

The next step involves integrating them with a free open source product called “Epilog” by this totally kick ass company called Intersect Alliance (www.intersectalliance.com). We will configure Epilog to send these flat files to a KIWI Syslog Server.

So we will go to the Intersect Alliance Download site to get epilog and run through the installation process. Once that is completed you will want to configure your epilog agent to “tail-and-send” your STA Log Files. We will do this by telling it where to get the log file and who to send it to.

After the installation go to START->Programs->Intersect Alliance-> Snare/Epilog for Windows

Under “LOG CONFIGURATION”, for STA logs we will use the log type of “Generic”, type in the location of the log files, and tell Epilog to use the format of STA20%-*.log

After configuring the location of logs and type of logs you will want to go to “Network Configuration” and type in the IP Address of your Syslog Server and select port 514 (Syslog uses UDP 514).

Once done, go to “Latest Events” and see if you see your syslog data there.
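
To confirm separately that the KIWI server is reachable on UDP 514, you can throw a hand-built syslog datagram at it. This is a minimal Python sketch using only the standard library. The function names, the sample text and the address in the usage comment are hypothetical; the <PRI> prefix follows the classic BSD syslog convention of facility * 8 + severity.

```python
import socket


def build_syslog_packet(message, facility=1, severity=6):
    """Build a bare BSD-style syslog payload: <PRI>message."""
    pri = facility * 8 + severity  # facility 1 (user), severity 6 (info) -> 14
    return ("<%d>%s" % (pri, message)).encode("ascii", "replace")


def send_test_message(host, port=514, message="STA logging test message"):
    """Send one datagram to the syslog server and return the bytes sent."""
    packet = build_syslog_packet(message)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(packet, (host, port))  # UDP: fire and forget, no reply expected
    return packet


# Example (replace the address with your own syslog server):
# send_test_message("192.0.2.10")
```

Because UDP is fire and forget, the script cannot tell you whether the message arrived; check the KIWI display for it.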


Section III: KIWI SYSLOG SERVER

I assume that most Citrix engineers have access to a SQL Server and since Epilog is free, the only thing in this solution that costs money is KIWI Syslog Server. A whopping $245 in fact. Over the years a number of event correlation solutions have come along; in fact I was at one company where we spent over $600K on a solution that had a nice dashboard and logged files to a flat file database (WTF? Are you kidding me?!). The KIWI Syslog Server will allow you to set up ten custom database connectors and that should be plenty for any Citrix administrator who is integrating XenServer, XenAPP/Windows servers, Netscaler/AGEE, CAG 2000 and Application firewall logs into one centralized database. While you need to have some intermediate SQL skills, you do not need to be a superstar and the benefits of digital epidemiology are enormous. My hope is to continue blog posts on how I use this solution and hopefully you will see benefits beyond looking at your STA logs.

The first thing we need to do is add a rule called “STA-Logs” and filter for strings that will let KIWI know that the syslog update is an STA Log. We do so by adding two filters. The first one matches the string “GenericLog”.

The second filter is “<UserName>”. Together, these two filters will match STA syslog messages.


Now that we have created our filters, it’s time to perform actions. There are two actions we want to perform. We want to parse the message (pull the user name, application and server address out of the log text above) and write that data to a table in a database. You add actions by right-clicking Actions and selecting “Add Action”.

So our first “Action” is to set up a “Run Script” action. I have named mine “Parse Script”.

Here is the script I use to parse the data (Thank you Mark Schill (http://www.cmschill.net/) for showing me how to do this.)

The Script: (This will scrub the raw data into the parts you want, click “Edit Script” and paste).

##############################
Function Main()
    ' Pull the user, application and server out of an STA ticket message.
    Dim MyMsg, Status, UserName, Application, ServerIP
    Dim UserBeg, UserEnd, AppBeg, AppEnd, SrvBeg, SrvEnd

    Main = "OK"

    With Fields
        Status = ""
        UserName = ""
        Application = ""
        ServerIP = ""

        MyMsg = .VarCleanMessageText

        ' Only parse messages that carry the CtxConnInfo XML block.
        If InStr(MyMsg, "CtxConnInfo.dtd") > 0 Then
            Status = "Successful"

            ' Each offset is the length of the opening tag, so Mid starts at the value.
            UserBeg = InStr(MyMsg, "<UserName>") + 10
            UserEnd = InStr(UserBeg, MyMsg, "<")
            UserName = Mid(MyMsg, UserBeg, UserEnd - UserBeg)

            AppBeg = InStr(MyMsg, "<ApplicationName>") + 17
            AppEnd = InStr(AppBeg, MyMsg, "<")
            Application = Mid(MyMsg, AppBeg, AppEnd - AppBeg)

            ' ServerAddress includes the port, e.g. 192.168.1.47:1494.
            SrvBeg = InStr(MyMsg, "<ServerAddress>") + 15
            SrvEnd = InStr(SrvBeg, MyMsg, "</")
            ServerIP = Mid(MyMsg, SrvBeg, SrvEnd - SrvBeg)
        End If

        ' Hand the parsed values to the database action via the custom fields.
        .VarCustom01 = Status
        .VarCustom02 = UserName
        .VarCustom03 = Application
        .VarCustom04 = ServerIP
    End With
End Function

##############################
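
If you want to check the extraction against your own log lines before pasting the script into KIWI, the same logic is easy to try offline. Here is a minimal Python sketch that mirrors the parse script, using the sample log entry from earlier in this post. It is a test harness only, not something KIWI runs, and the function name parse_sta_message is made up for illustration.

```python
import re

# The sample STA log entry from earlier in the post, split across lines.
SAMPLE = (
    'INFORMATION 2009/11/22:22:29:32 CSG1305 Request Ticket - Successful. '
    'ED0C6898ECA0064389FDD6ABE49A03B9 V4 '
    'CGPAddress = 192.168.1.47:2598:localhost:1494 Refreshable = false '
    'XData = <?xml version="1.0"?>'
    '<!--DOCTYPE CtxConnInfoProtocol SYSTEM "CtxConnInfo.dtd"-->'
    '<CtxConnInfo version="1.0">'
    '<ServerAddress>192.168.1.47:1494</ServerAddress>'
    '<UserName>JSMITH</UserName><UserDomain>cdc</UserDomain>'
    '<ApplicationName>Outlook 2007</ApplicationName>'
    '<Protocol>ICA</Protocol></CtxConnInfo> ICAAddress = 192.168.1.47:1494'
)


def parse_sta_message(msg):
    """Return the four values the parse script stores in VarCustom01-04."""
    result = {"status": "", "user": "", "application": "", "server": ""}
    if "CtxConnInfo.dtd" not in msg:
        return result  # not a ticket message, leave everything blank

    def tag_value(name):
        # Text between <name> and the next "<", as the script does with Mid.
        match = re.search("<" + re.escape(name) + ">([^<]*)<", msg)
        return match.group(1) if match else ""

    result["status"] = "Successful"
    result["user"] = tag_value("UserName")
    result["application"] = tag_value("ApplicationName")
    result["server"] = tag_value("ServerAddress")  # includes the port
    return result


print(parse_sta_message(SAMPLE))
```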

Now that we can parse the data we need to create a table in a database with the appropriate columns.

The next step is to create the field format and create the table. Make sure the account in the connect string has DBO privileges to the database. Set up the custom field format with the following fields. Ensure that the type is SQL Database.


You will need to set up an ODBC Connection for your Syslog Database and you will need to provide a connect string here (yes…in clear text, so make sure you know who can log onto the syslog server). When you are all set click “Create Table” and click “Apply”.


Hopefully once this is done, you will start filling up your table with STA Log entries with the data from the parse script.

I have included some queries that have been very useful to me. You may also want to integrate this data with SQL Server Reporting Services and with that, you can build a poor man’s Thomas Koetzing tool.

Helpful SQL Queries: (Edit @BEG and @END values)

 

How many users for each day: (Unique users per day)

-- @END is read as midnight at the start of that day, so set it to the day after the last day you want included.
declare @BEG datetime
declare @END datetime
set @BEG = '2009-11-01'
set @END = '2009-11-30'
select convert(varchar(10),msgdatetime, 111) as [Date], count(distinct username) as [Users]
from sta_logs
where msgdatetime between @BEG and @END
group by convert(varchar(10),msgdatetime, 111)
order by convert(varchar(10),msgdatetime, 111)

Top 100 Applications for this month:

-- Counts ticket requests per application between @BEG and @END.
declare @BEG datetime
declare @END datetime
set @BEG = '2009-11-01'
set @END = '2009-11-30'
select top 100 [application], count([application]) as [Launches]
from sta_logs
where msgdatetime between @BEG and @END
group by [application]
order by count([application]) desc

Usage by the hour: (Unique users for each hour)

-- Set @BEG and @END one day apart to see a single day's hourly usage.
declare @BEG datetime
declare @END datetime
set @BEG = '2009-11-01'
set @END = '2009-11-02'
select convert(varchar(2),msgdatetime,108)+':00' as [Time], count(distinct username) as [Users]
from sta_logs
where msgdatetime between @BEG and @END
group by convert(varchar(2),msgdatetime,108)+':00'
order by convert(varchar(2),msgdatetime,108)+':00'
