Monitoring Those Hard to Reach Places: Linux, MySQL, Oracle, Java and More


[MUSIC] One of the topics that comes up
frequently with IT professionals is well, SolarWinds monitoring is good for
the run of the mill stuff but you really don’t handle the special
things, the outliers like, and then they name something that’s
totally common that just happens to not be Windows® or Exchange® or
Microsoft® SQL or something like that. Yeah, right. As if they’re the only company that
we’ve ever met running Informix® or custom Java or MySQL® on Linux®. Uh-huh. Yeah, and it’s not that those
things are particularly unique, it’s just that traditional
monitoring vendors tend to overlook them unless they have
a specialized app for it. Right, so, in this session, what we’re
gonna do is address monitoring those hard to reach places, ya know, that
spot like right back there. Anyway, places which might not be
uncommon in your enterprise but which we hear about frequently as
sources of monitoring frustration. I’m Leon Adato, SolarWinds Head Geek, and with me today is Head
Geek Patrick Hubbard, and also Database Performance
Evangelist Janice Griffin. Hey. Welcome to THWACKcamp. So we’re just gonna show off the SolarWinds
tools that just monitor this stuff, right? We’re just gonna run down our stuff? Oh, no, not at all. As a Senior DBA, and a longtime Unix and Linux administrator, I can tell you there are so many tools out there that can show you the current state of
these systems and/or applications. In fact, many of the systems come
with the tools already built in. Yeah, and we’ve made this point before, both on SolarWinds Lab and
last year at THWACKcamp 2015: the differences between
those built-in tools and third-party tools usually come in
things like granularity of measurement, persistence, and then the level of action
that you can take on that information. Right, good points. Okay, so what I’d like to do then is
break down each section by first showing the commands or built in options that
are available within the system or the application or the tool. Native tools. Native commands even. Then we’re gonna show another
option which works well but might not be our top choice. And finally, we’re gonna show the best
of breed that really gets you everything that you need. So how does that sound? That sounds good but this is also gonna give you a chance
to totally geek out on Linux. Yeah, that was part of my plan. Okay, well,
then I might have to talk about Java. Oh, we’ll let you. Let’s talk about databases. Fair is fair. Wonderful.
Alright, so let’s get started. Alright, Leon the first hard to reach
place, not that I’m picking on you, Linux. Okay. Or maybe not really. Maybe not really? No, no, no really Linux, really Linux. Okay, so I am gonna jump in here. Okay, okay. So again, we’re gonna start
off with some built-in commands. And Linux has a plethora of them. I’ve got a screen up here,
good old lab Ubuntu system. Mm-hm. Of course, I wouldn’t do anything else. And so,
a few different commands are built-in. The first one are some
vaguely graphical commands. If you just have eyeballs on a screen,
for example, you’ve got top. Top is gonna give you just what’s
running at the top of the processes. It’s gonna give you the amount of memory,
and CPU that’s being used. Process monitor, command line for
life. Right, and it’s built in. Now there’s a few things that
are a little more built in. For example, you have glances, which is
a slightly more colorful version of top and does pretty much the same thing. But it gives you a little
bit more of a highlight. If you have errors down at the bottom,
you’re gonna end up seeing some memory or CPU errors that come up. And then you have something that’s a
little bit more advanced than that, htop. Which gives you again the top processes,
but now you see things with a little
thermometer bar at the top. You get memory and
swap get called out graphically. And you got menus down at the bottom so
you can do searches and filtering. [CROSSTALK] kill people too as well. And you can kill sessions, not people. We do not kill people.
[LAUGHTER] We kill processes. That’s what we do here. [LAUGH] So you’ve got that as well. So these are some fairly easy commands. But they do require that you have
eyeballs staring at screens to know. Even though you can do these
remotely and things like that. There are also some built in commands
that are good for the moment, like what’s happening right now. SAR or sysstat, depending on what version
of Linux or UNIX you’re using, is there. So if I type, for example, SAR, give me
the CPU values, it gives me three of them. So it’s now checking the CPU
three times to go look at that. I can do the same thing for memory. And this will give me
my memory information. So, I can use SAR or sysstat. Now that’s important, because we’re
gonna actually come back to that. Okay. So, remember that, folks. So, that is some of
the built-in commands and also some of the utilities
that you’ve got along the way. And handy for debugging one machine. Right, handy for debugging one machine.
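As a rough sketch of scripting the SAR check described above, the Python below runs `sar -u 1 3` (CPU report, one-second interval, three samples) and reads the Average row. The function names and the sample text are illustrative assumptions, not something shown in the session, and real sar output varies a little between sysstat versions. Swapping `-u` for `-r` would request the memory report instead, which needs its own column handling.

```python
import os
import subprocess

# Illustrative sample of `sar -u 1 3` output (an assumption, not captured from the demo).
SAR_SAMPLE = """\
Linux 4.4.0-31-generic (lab-ubuntu)  10/12/16  _x86_64_  (2 CPU)

10:15:01        CPU     %user     %nice   %system   %iowait    %steal     %idle
10:15:02        all      1.50      0.00      0.50      0.00      0.00     98.00
10:15:03        all      2.00      0.00      0.50      0.50      0.00     97.00
10:15:04        all      1.00      0.00      1.00      0.00      0.00     98.00
Average:        all      1.50      0.00      0.67      0.17      0.00     97.67
"""


def parse_sar_cpu(text):
    """Return the Average row of a `sar -u` report as {column: percent}."""
    columns = None
    for line in text.splitlines():
        tokens = line.split()
        if "%idle" in tokens:
            # Header row: keep only the percentage columns (%user, %idle, ...).
            columns = [t.lstrip("%") for t in tokens if t.startswith("%")]
        elif tokens and tokens[0] == "Average:" and columns:
            # The percentages are the last len(columns) fields on the row.
            values = tokens[-len(columns):]
            return dict(zip(columns, map(float, values)))
    raise ValueError("no Average row found in sar output")


def sample_cpu(interval=1, count=3):
    """Run sar (needs the sysstat package installed) and parse the CPU averages."""
    env = dict(os.environ, LC_ALL="C")  # force '.' decimals and a stable layout
    result = subprocess.run(
        ["sar", "-u", str(interval), str(count)],
        capture_output=True, text=True, check=True, env=env,
    )
    return parse_sar_cpu(result.stdout)


if __name__ == "__main__":
    print(parse_sar_cpu(SAR_SAMPLE))
```

This still only covers one machine at one moment, which is the limitation the speakers go on to address.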
Again you could script things, but really there’s better ways to do this, obviously. So let’s go ahead and
get out of this telnet session, and let’s go ahead and take a look. Now I’ve got the same
system up in monitoring and one of the overlooked options that
a lot of people don’t realize is that if you have SAM, you have the real-time
process monitor, it’s still real-time. You’re still eyeball staring at screens. And unlike the Windows monitor, it’s not gonna hammer the system right
off the bat because, like with top, you could actually have hundreds and
hundreds of processes. So here it’s telling you
how many are running. If you click “Show All” you get all of them,
cause that’s, sometimes people forget that especially when they sort by CPU
allocation then they won’t be that way. Yeah. Right, so, a couple of things. Still sort of interactive, but it’s a good
way to see what’s going on right now. The thing I like about it, though, is
you have the option to start monitoring. Now what that means is that you’d be
throwing that process into a template and be monitoring it on an ongoing basis. So that’s the real-time process monitor. And that, I would call your in between. It would be in the same category
as top and glances, what have you. It’s not a command you have
to issue instantaneously. It’s giving you an ongoing system but
you still have to watch it. Well, unless you do “start monitoring”
which allows you to then say you know what I do, I wanna keep an eye on this over time. And then-
Right. And then it’ll take care of it. So, and that gives you something like this.
This is a template for BIND. It’s a Linux-based template. And it’s looking at two things really. First of all,
it’s making sure the process is running. And it’s also at the same time
collecting how much memory and CPU is being used by that process,
which is always good. But it’s also collecting the number
of queries per second, and that’s what this graph is showing us here. And it’s even doing the queries for
multiple elements there, the different domains,
the recursion and so on. So you’ve got a template and we have
a variety of Linux-specific templates, everything from BIND and DHCP and
CUPS because printing is important. Printing is important. It matters. All the way to Tomcat, Apache,
your different kind of Linux applications,
your standard Linux type things. But you’re not gonna have to apply this, if you have 1,000 instances that you need
to monitor, you’re just gonna define this one template to either pull one out-of-the-box or to create one and then you can apply the same one over and
over again by just adding it to a node. Right, exactly. So you’ve got the hard work once to make
sure you get the template nailed down and then you push it out. But I wanna keep going. Well, don’t you also have history? Well, how much history does this record,
because that’s the valuable part I think when you have many systems to monitor that
you actually see things over time and get some idea of anomalies
versus this is normal. Mm-hm.
How do you know if you just all of a sudden look at a point in time? Right, okay. So, back from that eyeball
staring at screens. And it’s a good question. So, in SAM,
we’re keeping a year’s worth of data. Great. We do a little bit of summarization
there because the interfaces shouldn’t get too big. But otherwise we are keeping that
historical data which you can see down here for this, but you can go all the way
back to a year ago if you’ve been running- Now that’s by default, out-of-the-box and completely tunable. Mm-hm.
Oh, that’s good. Yeah. So moving on though, what you also can do is you can
take monitors out of other systems. Right.
So most famously, Nagios. Yeah. So I know that you’ve
done a little bit of that. Yeah, Nagios monitors,
I’ll show you one right quick. The great thing about it is if you
already have Nagios in place what usually happens is you’re transitioning to SAM
because you’ve just reached the point that you’re spending more time configuring it
than taking advantage of the data, right? But you don’t wanna have to
start over immediately with SSH based scripts directly
that would replace those. I mean, over time,
that’s where you wanna go. Right.
But the first thing to do is just grab all the
data that you’re already grabbing and one of the easiest ways to do that is come out
to your application monitor templates and there’s a great one here
that’s ready to go. And if you’re not using search to find
resources here something is really wrong. So here’s one for Nagios. Like the file directory count, right, so I’m gonna say edit and we’ll
pull this thing up and take a look at it. And you’re gonna notice something that
looks really familiar here, right? So again the top here this is a template
so a template again contains multiple monitors. The monitor that I’m looking
at here is just a single one and this is a Nagios script monitor. It’s using SSH to upload the script to the server,
run it and then process the result. So basically, it’s exactly what
it would have been before, this one happens to be a Perl monitor,
right? It’s returning it in the same format, delimited the same way that you would
expect it to come back out of Nagios but it’s smart enough to know
what to do with that. You can add the script if you want to but,
you would usually just use the scripts that you already have and then everything
else after that alert thresholding and the rest of it is all gonna be the same. So a lot of times what I recommend that
you do is if you have Nagios scripts that are working great, go ahead and
pull them in as a Nagios script monitor. Then, develop an SSH-based monitor
that’s actually a script that will get you even more detailed data or maybe execute
a script that’s running on that machine that will pull say multiple elements back,
not just one or two elements. And then assign both of them
temporarily to that same node, verify that it is working and
then you can remove the Nagios monitor. So it is a great way, over time, of migrating
from a really rich set of Nagios monitors to higher performance SSH monitors. So really a built-in sunsetting kind of concept? Absolutely, and verification. Great. Okay, thank you. So I want to keep on going so we’ll just
do-si-do one more time here. And I want to talk about
something which is really new on the horizon
which is Linux agents. Are you telling me that
Net-SNMP is just not? I am not. I am a Net-SNMP,
I’m an SNMP junkie. I love SNMP, despite all the bad
things that people say about it. No, but the Net-SNMP agent for Linux. We have tolerated it as an industry,
not just at SolarWinds for a really long time but it is so
easily upset by specific distros, by configurations on the boxes,
inconsistent CPU monitor. This…you guys have asked for a really
long time for the same qualitative information from Linux systems that you
get with WMI and RPC out of Windows boxes. This is a way to actually do that. It is absolutely that way, and there’s one other benefit that
I’ll show in a second here. But, yes, we finally have
an agent-based option, if you want it. Right.
You can still do things with SNMP. You can still do all of that stuff which, again. Or SSH. Or the SSH scripts, of course. So, the only thing I need to show on this
screen is the fact that everything’s the same, except that the polling method is agent,
right there. But otherwise-
Everything is most certainly not the same. Scroll back just a little bit here and let’s take a look at what we have
here under Hardware Details. Yeah, okay. So we got chassis,
we got temperature, we got CPU and memory and it’s automatically
pulling all that for me. These are not separate
monitors that then all go to a monitor count da, da, da, da, da. It is figuring all of this out for me. Fair enough. Okay.
Yep, you’re getting all your hardware and
all the really good hardware stuff, like your memory modules and
things like that. Right. Sorting out virtual, this is physical,
and some other things are little harder to get out of the OIDs that were
coming back from Net-SNMP. So yeah, so you’re getting a few other
things but again, your CPU, your memory, all your counters are gonna look the same. It’s just coming from a local agent
running it on top of that but here’s another thing. So I want to show you this application,
this Linux Disk Monitoring on Perl. Now, it doesn’t sound like
that would be difficult, disk monitoring is running Perl except
that this machine has a specific SSH port on it and
I didn’t code the port into this. I just assign the application
template like I normally would and it just started working,
why did it started working? Why did I not have to go in and edit every
single component like I normally do? When my Linux sysadmins
have set specific ports, so they need to send you the SSH key,
the pre-generated key. Cause you have one port. I have one port and
the agent was already on there. So if you’re distributing templates
to a variety of Linux agents that have different, they’re beyond
a firewall, they have security protocols, they have special ports, they have
locked SSH pre-shared keys, any of those.
get past all those things. It will allow you to manage your entire
fleet of Linux systems without that heartache. And there’s a bunch of
different deployment options. You can push it. You can build a deployment that
you can actually run local. And in all of those cases it will actually
use the Orion instance as the repository for the binary so you can also be assured
once you’ve verified your code or signed it that you’re getting good agent code
especially if you have compliance issues. Right and I would love to spend
another ten minutes just digging into the agent but
I’m going to show some restraint. We will definitely be talking about
that in a future SolarWinds Lab. I was gonna say yes, we will be
talking about that in SolarWinds Lab. So we’ll be talking about it a lot. But anyway, the point is that when
we’re talking about built-in commands versus sort of good enough to get
it done right now versus whatever, this is a really great way. This is one of those best of breed
options for getting monitoring done. Especially in hard to reach, now we’re
talking about in your environment, hard to reach places. You think in the live chat
right now it’s going yeah, oh, fun. Woo-hoo, yeah. I hope so. So the last thing I wanna show
is coming back to the beginning which is scripted options. You know, we have SNMP monitor options. We have Nagios, which is
sort of a wrapper around it. But I do wanna mention that if you know
that there’s a command in Linux that works well, that doesn’t mean you have to
abandon it because you’re using a tool like SolarWinds, like SAM. Here you can see that I’m using vmstat,
which is another command that we had. And it is wrapped in a whole script and
the output is simply being processed. And you can see that here. Now, the funnier part about this. And this is a multi-value script. This is a multi-value script, also something that we’ll be
talking about in the near future. I want to point out that
the template is called AIX_LPAR. I wrote this a couple of years ago when
I was working back at Cardinal with Josh Biggley. Josh, just shout out. Woo!
[LAUGH] So we wrote this because we had a lot of AIX_LPARs. And we didn’t have a template or
any way to get in there. Right.
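To make the multi-value script idea concrete, here is a minimal Python sketch that parses vmstat output and prints several values in one run. It assumes the `Statistic.Name: value` / `Message.Name: text` output convention that SAM script monitors are documented to read; that convention, the metric names, and the sample text are assumptions for illustration and are not taken from the template shown in the session, so check them against your SAM version.

```python
# Illustrative `vmstat 1 2` output; the last row is the most recent sample.
VMSTAT_SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 803492  92768 612340    0    0    12     8   55  120  2  1 97  0  0
 0  0      0 803100  92768 612344    0    0     0     4   60  131  3  1 96  0  0
"""


def parse_vmstat(text):
    """Map vmstat column names to the integer values of the last sample row."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    for index, tokens in enumerate(rows):
        if tokens[0] == "r" and "free" in tokens:  # the column-name row
            samples = rows[index + 1:]
            if not samples:
                break
            return dict(zip(tokens, map(int, samples[-1])))
    raise ValueError("no vmstat samples found")


def to_monitor_output(stats):
    """Format selected values as Message/Statistic pairs (assumed SAM convention)."""
    wanted = [
        ("FreeKB", "free", "free memory in KB"),
        ("SwapIn", "si", "memory swapped in per second"),
        ("CpuIdle", "id", "percent CPU idle"),
    ]
    lines = []
    for name, column, description in wanted:
        lines.append(f"Message.{name}: {description}")
        lines.append(f"Statistic.{name}: {stats[column]}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_monitor_output(parse_vmstat(VMSTAT_SAMPLE)))
```

Because the parser keys on column names rather than positions, the same wrapper has a chance of working across systems whose vmstat prints the same columns in a different order, which is in the spirit of reusing one script on both AIX and Linux.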
So, but they had SAR. And so we wrote this script, but
I can apply it to my Linux boxes, because they have the same command set. And do a little bit of tweaking,
a little bit for this, but otherwise, it’s picking up
memory values and things like that. So again, we’re just wrapping a
command that we knew how to use, whether that’s TOP or SAR or whatever it is, into a scripted
option and pulling up that data. And you can use the application
discovery to automatically search and apply these as well. To, yes, to apply these, to make sure that the commands
actually run successfully. So if I know, like,
let’s say I do a discovery on a particular version distro, I can return the list of machines that fit that requirement and
then automatically apply these scripts. So it allows me to also
bulk apply these as well. It’s not one at a time. Exactly, so the next thing I
wanna do is dig into MySQL, let’s just take a minute to reload and
then we’ll get started with that. [SOUND]
Okay, so we’re gonna talk about MySQL for a bit
because a lot of Linux systems use MySQL. MySQL is open source,
it’s easy to download, easy to install and pretty soon you’ll find yourself
running an enterprise database, when you really didn’t plan on it. Comes with pretty good tools. Yeah, it does. Actually Percona and MariaDB, they have
their own little sets of tools as well, but I’m gonna start with the command line. Good.
And just like you said, Leon, and you showed with Linux,
all those good utilities, you can actually get some good information
about MySQL with those utilities as well. But within MySQL itself, you can actually,
it comes with a performance schema and an information schema. So, I can log in at the command prompt. So,
you basically just started the shell here. Yeah, I just started a MySQL shell. And I can show databases and we can
kinda see all the databases in there. And I’ve got two here. Actually, we’ve got three
because Oracle has been so nice as to actually put some views
on top of the performance schema. But this is gonna give me all kinds of
good information about what’s happening inside a database. And with your applications,
there will be a lot of times databases are the bottlenecks of applications,
so you kinda have to go there to see. So, I can actually just, I’m just gonna show all the tables,
we’ll just connect to performance schema. And I can show tables. And what’s neat about this is we can
actually see not only the wait states, we can see the sessions. We could see all kinds of good
information about each session. How many executions? How many rows were examined? All that good stuff to
see inefficient queries. And so it’s a great way for DBAs, as well as Linux administrators,
who often become accidental DBAs, just because they’re on a Linux box and
they have to take care of it, so. [LAUGH] But that’s a good way to see that. And you can see,
you’ve got some current history. You’ve got a long history, if you want
more information, you can configure that. And you get all this good stuff, you get
the statement, you’ve got, like I said, the wait states, you get the threads here. I have got one query I
would like to show you. And we’ll just go to, I’ve got it scripted
and a lot of times you’ll do that just because it’s easier than to try and
remember typing it all out. And that’s a pretty common
thing in any sysadmin role, is that the commands you
are issuing more often, you have sort of contained into some
sort of batched process or whatever. And you can delegate them to somebody
else to take care of, too, which is nice. You can take vacation. [LAUGH] Or automate it. Or Crontab it or whatever, right. So, this little script here, and
I’ll just edit it here real quick, so you can kind of see. We’re going to
the events_statements_current, and then I’m going to the threads table. And basically, when you start looking at
queries, you want to look at how long you’re spending, what wait states, what
resources they’re using up or waiting on, and
see how much they’re reading. If they’re examining all
the rows in a table but they’re only returning a few back,
then that’s an inefficient query. So, that’s kind of what you look for
out here. So, as you can see on my screen, I’ve got
the THREAD_ID, I’ve got the user who’s running it, what database they’re in,
and you can see, I’ve got rows examined. Look at that number there. Look how many rows were actually
sent back to the process. A lot of work, for no reason at all. For no gain, right. So that’s just a real quick way to
use some of the local built-in tools just to quickly find where all your time
is being spent in a MySQL database. Now, the thing that comes with Oracle,
so you can download their workbench, and I wanted to show you that real quick, I
think I’ve got it right underneath there. This is not the workbench that I
grew up with, I’ve been using MySQL for a really long time and I can remember
when this was a very different beast. Yeah, Oracle’s come a long way
with actually making this better and more useful. But you could see here,
you can get your server status, so I can get kind of a quick peek of, okay,
what are my InnoDB reads per second, writes per second, how many connections. I can look at the traffic and
kind of see that there. But what I really like is that
now they’ve got a dashboard. And the dashboard here, you kind of get a
look and feel for how things are running. And you’ve got now about three minutes
here that you can kind of go back over time. So, it’s not a whole lot of history,
but it does kind of give you, and sets you into the state that gives
you a good feel, working feel. And you mentioned before
that depending on your distro, you have Maria or Percona,
they have a similar setup for that. Yeah, they have a lot of tools too that
you can download and play off of as well. So, here’s a question,
when you run this on a server, it gives you some additional control
to be able to start and stop and manage processes and configuration, but you can also run this off of your
workstation connected remotely. Do you tend to run it like
this through an RDP session or do you tend to actually put
it on your workstation? This one, yeah, this one is really
a workstation product, so you do have to, it’s a Windows product that you have
to download and then do it remotely. Or if you’ve actually installed MySQL
on Windows, you can do that as well. But I’m remoted in here to the Linux box. That what you, nevermind,
nevermind, nevermind. [LAUGH]
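The scripted Performance Schema query described earlier (events_statements_current joined to the threads table, comparing rows examined with rows sent) can be approximated as below. The actual script is not shown in the session, so the SQL is a reconstruction using standard Performance Schema column names, and the Python helper, its name, and its thresholds are assumptions for illustration. The SQL is held in a string so it can be run with whichever MySQL client or driver is already in use.

```python
# Reconstructed query; TIMER_WAIT is reported in picoseconds, hence the division.
INEFFICIENT_QUERY_SQL = """
SELECT t.THREAD_ID,
       t.PROCESSLIST_USER   AS user_name,
       t.PROCESSLIST_DB     AS db_name,
       s.TIMER_WAIT / 1e12  AS elapsed_s,
       s.ROWS_EXAMINED,
       s.ROWS_SENT,
       s.SQL_TEXT
FROM performance_schema.events_statements_current AS s
JOIN performance_schema.threads AS t USING (THREAD_ID)
WHERE s.SQL_TEXT IS NOT NULL
ORDER BY s.ROWS_EXAMINED DESC
"""


def flag_inefficient(rows, min_examined=10_000, min_ratio=100):
    """Return (thread_id, examined/sent ratio) for statements that read far more rows than they return.

    `rows` is a list of dicts keyed by column name, as a dictionary-style cursor would yield.
    The thresholds are arbitrary illustrative defaults.
    """
    flagged = []
    for row in rows:
        examined = row["ROWS_EXAMINED"]
        sent = max(row["ROWS_SENT"], 1)  # avoid dividing by zero when nothing is returned
        ratio = examined / sent
        if examined >= min_examined and ratio >= min_ratio:
            flagged.append((row["THREAD_ID"], ratio))
    return flagged


if __name__ == "__main__":
    demo_rows = [
        {"THREAD_ID": 41, "ROWS_EXAMINED": 2_500_000, "ROWS_SENT": 5},
        {"THREAD_ID": 42, "ROWS_EXAMINED": 120, "ROWS_SENT": 100},
    ]
    print(flag_inefficient(demo_rows))
```

A high ratio is only a hint. Aggregate queries legitimately examine many rows and return few, so the flagged statements are candidates for a closer look rather than confirmed problems.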
Keep going. [LAUGH] Okay, another thing that you
have here is your performance reports. And these are pretty interesting because
you can see I/O, high-cost SQL, you can see the statistics
of the database schema, your wait events there,
it’s very nice that way. So, you can actually run a report and
kinda get that and set that up to email and mail out. So, it is pretty nice for
just point in time. You don’t need much history here,
you can go back a little bit but for point in time and looking at it,
it’s a nice useful tool. Right, and I wanna mention that. So, we’ve got the command line options. We’ve got one built-in, one sort of, built-in desktop tool
that’s near real-time. It gives you a little bit of history but
not much. I’m just gonna mention that there is,
of course, a MySQL template built into
SAM that is also available. But that works like all
the rest of our templates but I really think that where we wanna go with
this is we wanna show the best of breed. We wanna talk about like, when you really
have to monitor your actual databases and you actually have to do this for your job,
what is it that you wanna be seeing? Again, I’m not talking about a particular
tool, although, we may be mentioning that. But the fact is that you need to
have an enterprise class solution. And these are not that. Especially, if you don’t want to be
managing a database but you have to. Yeah, definitely in those cases. But even if you are a DBA, you don’t
wanna be cat wrangling all day long. Or you’re a SQL DBA or DB2 DBA-
Or an Oracle. Or an Oracle. But someone says, oh, you know what,
you’re already a DBA, I’m sure you can also manage MySQL. Right.
It’s the same. Database, database. [CROSSTALK]
Oh by the way, here’s a hundred of them,
Right. Can’t you just handle that? Handle that. So, that’s when it becomes hard because
when you have tools like these, they are mostly point in time. You have to focus in on a specific one,
you can’t do this kind of generic triage that you would
with the enterprise wide tools. So let’s,
what do we have to look at here? Okay, I wanna show you
the Database Performance Analyzer. We call it DPA for short. But this is kind of your enterprise
solution for all databases, not just MySQL cuz it actually
monitors more than that. So this is looking at all the discrete
data that I would be able to get to with the command line,
if I knew the right spell, but it’s taking care of that for me. Yeah, and it’s doing more than that
cause it’s actually keeping up to five years of history. So you can go back and see trends,
you can get baselines, you can do all kinds of good stuff to
find out that whole thing about what’s normal versus what’s abnormal. Because a lot of times when you’re
in a database and you’re trying to solve problems you know it’s like
finding that needle in the haystack. And if you have something to look back on,
and say well how did this run yesterday? Or how did it run last month then you
have an idea of where you’re looking at. Or what happened in the middle of
the night while we were doing backups. That’s when things tend to go sideways. Yeah, yeah or
when I was on vacation last week. And then,
everybody’s yelling at me this week. Yeah So anyway. So here we’re looking at I’m monitoring,
up to 23, I’ve only got 20 monitors on right now,
23 instances of different varieties. But you see I got this master slave kind of combination of MySQL
here that I’m monitoring. And as you notice the slave is one of my
busiest servers and as you see up here, highest wait time. This is actually looking at the response
from the database to your end user processes or your end users if they’re
sitting there waiting for a response. So you know, lots of wait. On a specific day,
there was 75 days of wait. That means every active
session waiting for a response from the database
in one 24 hour period. One day was 75 days long. Yeah.
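The "75 days of wait in one day" figure is easier to read once you treat it as wait time summed across every session: 75 session-days of waiting inside a 24-hour window is the same as an average of 75 sessions waiting at any given moment. The small Python sketch below shows that arithmetic. The function names are made up for illustration, and how DPA itself aggregates wait time is not described in the session.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def total_wait_days(session_wait_seconds):
    """Sum per-session wait times (in seconds) and express the total in days."""
    return sum(session_wait_seconds) / SECONDS_PER_DAY


def average_waiting_sessions(total_wait_seconds, window_seconds=SECONDS_PER_DAY):
    """Average number of sessions waiting at once over the observation window."""
    return total_wait_seconds / window_seconds


if __name__ == "__main__":
    # 150 sessions that each spent half the day waiting add up to 75 days of wait.
    waits = [SECONDS_PER_DAY / 2] * 150
    print(total_wait_days(waits))
    print(average_waiting_sessions(sum(waits)))
```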
[LAUGH] That’s a lot of, that’s a long day. So that’s one of my busiest databases. I want to look there for sure. We start trending up and
trending down over time so you can kind of get an idea of,
well, there’s no activity. Maybe my problem’s elsewhere. Or, we actually alarm on simple
things like CPU, memory, disk. Most monitoring tools do that. Blocking, locking we show. We have an advisor that
actually gives you advice. So, a lot of that monitoring and
stuff at the command prompt or even at workbench you’re having to
sit there and use your knowledge. Well, with DPA you can suddenly pass
off some of this to maybe junior DBAs, because they’re trying to learn. They’ll learn as they go through here so
it’s kind of a cool tool that way. I like it as one of those accidental
admins who really has never been trained as a DBA, it’s not that this is going
to do my work for me but this is going to give me the right questions to go ask
and the right things to go dig into. Because when you first dive into
the ocean of database administration. All the fish look pretty and all the
colors and everything looks very exciting. But you don’t know which ones
are actually dangerous and which ones are actually just interesting,
but okay. So this really helps bubble those up. Yep, that’s true. Okay, so databases can be very complex. And MySQL is no different
than any of the others. They all have their different tools and
their different behaviors. It’s a very mature product. Yeah, so if you wanted to start someplace to try to define where do I start,
particularly if you have very long-running SQLs, this tool actually
has an advisor that you can use. Go to the advisors and see what they say. You can see here this one’s spending
over 100% of its time sending data. Well, if you’re an accidental DBA you
may not know what sending data means. So you can actually click into the advisor
and get some good information. And not only that, you get to see the queries. You can
see how much time it’s taking up. This is almost 50% of the whole
instance execution time in one query. So if I could fix that I could give back. So I don’t know,
the advisor is a good place to start. Sending data,
all of our wait types and all that are documented here so you can kind of get
an idea if you don’t know what they are. And I will admit that I
don’t know what every single one of these elements means even when
I am comparing database to database so it is helpful because a lot of times the same term means something
completely different with a different vendor’s product. Right, and this just becomes a really
good learning tool for those junior DBAs or accidental DBAs, that we flag
the problems that we talked about before. We tell you what’s important
versus what’s actually okay. It just is a big number but
it’s an okay big number. And then, here we help you dig down and
understand the terms and even if you have to go to somebody else,
you’ve got the words. Yeah, and most definitely you can
go here down to the supporting data and kind of see what it did up to
a certain time, and you can ask for advice in this tool anywhere in it. Which is kinda neat, because a lot of
times you get down into something and you’re just not sure about, just go to
the advisor and it’ll tell you times and everything else that you wanna look at. Well, I think another way that
I like to approach this, because I’m more systems. I was a DBA once upon a time, an okay DBA. But I like to approach things more
about resources, what would exist and sort of walk down the resource list
to see how things are organized. So a lot of times, when I’m troubleshooting an issue,
I think about resources. So, when you go into resource views, think about learning about
the things that you don’t know, right? So being able to do things like,
if you wanna go and explore something like CPU metrics,
of which there are many in databases. Or you want to take a look at
network objects believe it or not, there’s an awful lot of networking
metrics that are available. And just walk through these lists and
click and explore what these metrics look like and a lot of times you’ll start to
see things that you’re familiar with. But the other thing is, to your point
earlier, talking about being able to detect those changes that something
changed, right, on the network: definitely look for event correlation. This is taking advantage of the VM plugin. But how many times does vMotion
do what it’s supposed to do but with maybe a not exactly perfect… Unintended consequences. Consequences, right. And it decides we need to move this
database and then here I can say oh, you know what, I now have a lot more… my
performance on my queries has gone down. Yes, there was a phantom
move in the background. So thinking about events as a part of
what’s going on in a database is really, really important. And then correlating that back
with each of the resources that might be driving that. So the interesting thing about this is that MySQL is one of those hard to
reach places that we hear about. But everything that we just did applies
to another database that we hear about sometimes as far as, this is a challenging
thing for us to monitor, which is Oracle. But it seems to me that Oracle
should support command line. It should support a workbench. I mean,
a lot of this should carry over, right. Yeah, it does. You’ve got your same, what they call
virtual views that you can access and get all the thread information and
all the process information. But you can also get Enterprise Manager. Mm-hm. And if you buy the tuning and
diagnostic packs, you can get the same information
we’re showing here in our tool. So there’s really. Functionally, there’s no difference but I definitely wanna see what Oracle
has to show us on the screen. Okay, yeah. So here’s the same look and feel because
we were just in the MySQL database but I can also monitor many different types. And so, this happens to be a 12c database. So I’ve got a container database here
that has multiple databases within. And I’m monitoring all of them right here. So it’s just a real quick way to see it. You can monitor,
if you’ve got Real Application Clusters or Exadata, it has all that good
information in there to actually see. But I can click in here and get the same
look and feel that you did with MySQL. You can get the advisors. You can correlate with the resources. If you’re more into resource
instead of wait analytics. But you get all that good information and
actually see that whole five years of, again, finding out what’s normal
versus what’s not normal. Right.
And the same thing should carry over for, I don’t know, Informix. Sybase. Sybase. Or now DB2, yeah. Right, not to start a religious war. We’re not saying which one is better. We’re just saying that none of them are actually
that hard to reach. Right. And monitoring and managing, again because we were talking about
monitoring these hard to reach places, it shouldn’t require a term you coined,
which was swivel-chair integration. I wish I had coined it. I’ve just resurrected it. Oh, okay, well I love it and
I keep using it. But the idea that I have all these
different screens that I have to use and then mentally integrate everything. Here
you’ve got everything in one place. Well, and the other thing too is,
when you focus on single dashboards, it allows you to just learn the
differences between applications, right? So, the fundamentals that you learn,
you become, let say a pretty decent
SQL server administrator. Maybe this isn’t fair to compare SQL and
MySQL, but then you put them side-by-side
in one tool, any tool, ours or anybody’s, it makes it easier to figure
out the differences when you learn those new named metrics, when you learn
the differences about the way that they handle wait times or
latching or something else. So you can almost,
by choosing a common tool, you can morph your knowledge about onei
nto one that you’ve accidentally ended up having to manage when a manager says
oh, that’s a database, it’s the same. I’m sure you won’t have
any problem with it. Right, alright, so
this covers databases, again, especially the ones that we hear about
a lot at conventions, but there’s one more hard to reach place and that’s monitoring,
believe it or not, monitoring Java. That is not a hard to reach place. But we hear about it all the time,
I don’t know, you hear about it as well. I do, but
there’s configuration involved. Yes.
And people kinda stop at heap and they stop at command line configs and
then you say JBoss and they kind of walk away but
it’s, it’s doable. So I wanna take a minute and
just reload the screen and then I wanna dig into it. So with monitoring Java, I think
it’s worth mentioning all of the command line tools: you can do ps,
ps -ef, you can monitor your processes.
That all exists, right. And then there’s also a lot of logs. There is a ton of logs. There is a ton of logs, and you can do
your typical log file monitoring. In fact, there are actually log file adapters for Java? Yep. So Log4j? Yeah, Log4j, or
maybe you want to use one of the Log4j adapters and kick it up to Papertrail. So we’ve got that also. So those all exist and we don’t
want to ignore that they exist, but this is, again, what can we do hands on,
what’s built into the tool? And one of the first things you
told me about was JConsole. Yeah. So I wanted to take a quick look at that. Yeah, JConsole, what we’re looking at here,
basically comes with the installation. So you can just go to the JRE
directory and connect, and you can see my connection string here. Actually, DPA is a Java tool,
a Java-based application. So you can actually get
some good information about different applications in Java. And what this is showing is the overview
of it, and I can see my heap memory usage. I can see all my classes,
the number of them. I can see the number of threads. That’s important,
because if you’ve got memory leaks and you keep getting more threads and more
threads, that should be a stable number. So you are eating up memory there. And your CPU usage. So you get, that first screen,
you get quite a bit of information. You can also do, you can look,
deep dive into the memory. And look at the threads more and
go on and so forth about that. But it’s a great first point-in-time tool
to actually do any debugging, I guess, if you’re having issues. It’s task manager for Java. Yeah. Yep, there you go. Now, it’s not recording history. No, it isn’t.
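For reference, the figures on the JConsole overview screen (heap usage, thread count, loaded classes) come from the JVM's standard platform MXBeans, so the same point-in-time readings can be pulled with a few lines of Java. The following is a minimal sketch and not something shown in the session: the class name JvmSnapshot and the map keys are made up for illustration, and it inspects the JVM it runs in rather than a separate application such as DPA.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;
import java.lang.management.ThreadMXBean;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: a one-time reading of the values JConsole's overview shows.
public class JvmSnapshot {

    /** Reads heap, thread, and class counts for the current JVM from its platform MXBeans. */
    public static Map<String, Long> snapshot() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        ClassLoadingMXBean classes = ManagementFactory.getClassLoadingMXBean();

        MemoryUsage heap = memory.getHeapMemoryUsage();

        // Key names are illustrative only; LinkedHashMap keeps them in insertion order.
        Map<String, Long> values = new LinkedHashMap<>();
        values.put("heapUsedBytes", heap.getUsed());
        values.put("heapCommittedBytes", heap.getCommitted());
        values.put("liveThreads", (long) threads.getThreadCount());
        values.put("loadedClasses", (long) classes.getLoadedClassCount());
        return values;
    }

    public static void main(String[] args) {
        // Print one line per metric; the actual numbers vary from run to run.
        snapshot().forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

Like JConsole, this is a single reading. Building a baseline means polling on a schedule and storing the results somewhere, which is the gap a monitoring tool fills.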
It’s not real time. I mean, it’s real time,
it does not have any history at all. So, you don’t have a baseline or anything. You gotta sit here and look at it. Right, okay, so taking that to the next
step within SAM, we have a JMX monitor. We have a component that will pick up JMX,
so I wanna take a quick look there. Now, don’t move. It’s just a quick thing here. But here what you see is that we’ve got,
I threw together a template. Also pointing at our DPA servers,
just what we had at hand. And you can see we’re collecting
a few different JMX monitors here for the same thing: heap size, and stuff like that. So for example, tell me a little bit about what these are showing us in
terms of DPA as an application. So you can see your heap memory
usage and see if it’s growing, you can actually get some information
about your garbage collectors too. It could be growing by leaps and bounds. And just to control memory as
a whole in a Java environment. Also you’ve got here execution time and
your min, max and averages so you can kind of see how
it’s performing as well. Right and what I learned from putting
this together, cause I’m not a Java guy, is that unlike maybe some of the shrink-wrapped
applications like SharePoint or Exchange or whatever, where you can pretty
much throw our monitoring at it and start to get back data, you have to understand
your Java application. You have to have a sense of these are the
important counters, the important elements of it as the application was written
because it’s a custom application. Yeah, a custom application or it’s
running in an application server container. Right, so you have to have a little
bit more awareness of what’s going on. I couldn’t really set this up until you
were sitting next to me saying, yeah, we don’t care about this and we don’t care
about this, oh that, that’s a good one. Okay, but let me untangle JMX just a little bit. Okay.
Let’s set up one from scratch. I’ll show you sort of the difference
of the way it works and we’ll use JBoss as an example. I’d love to.
Okay. Okay. So remember we didn’t really
talk about it that much, but when we talked about JMX,
if you’re a Java admin, I’ll get through this in
a couple of minutes, just ignore me. You already know what I’m about to say. But if you think about what WMI is, right, WMI is libraries of metrics that
are organized in a container so that you have applications that
spit out common types of data. JMX is an attempt to do that for Java,
right, so it’s Java Management Extensions. And the way that it works
is, in your example you connected directly to the main JVM,
right? Yep. You were not necessarily
using a lot of JMX. If an application exposes a JMX management
framework, it’ll expose a set of classes as MBeans, and you’ll see those in
a minute here, right, so management beans. And then those classes have attributes, so all you’re really doing
is giving it a few things. You’re giving it a port to connect to,
the JMX server port, and one of the things people forget about with JMX is that you
need to turn it on in your applications. That’s where a lot of times you get lost, and
there are actually some THWACK articles out there that will walk you through how to do
that, whether it’s just enabling it for the JVM itself or
a particular application. When you get into something like JBoss or
IBM WebSphere or one of the other containers, now
you have multiple levels, right? You’ve got the server itself,
then you have the container that’s managing the different
applications that are running on it and then the layer below that is
the individual applications and the great thing is that if they’re
enabled correctly, and for the most part, especially
with JBoss and the rest of them, there’s a certain default level
that is exposed on a port. You can connect right to them
with a common set of attributes. And a lot of these templates actually
ship out-of-the-box with SAM. But finding additional ones or
building them is pretty straightforward. Google is your friend. And for just about any application that
has any sort of decent monitoring for Java, they’re available. And so when you look at things like,
so again, this is JBoss. So we’re combining both the Java, the JVM-specific metrics along
with the JBoss-specific, right. So if I go in here and I look, for
example, at JVM heap size, right, so this is basically just the integer size in
bytes of how much the heap is consuming. Right, I’ve got a port number, that’s
the network port that it’s connecting to. It’s giving a URL path that it’s
expecting, like the server, actually like WMI,
where there’s different namespaces. That’s basically the namespace
where it finds it. Then it’s gonna give me an object name,
I’m gonna give it the type, so in this case it’s java.lang:type=Memory, so
that’s the memory, the MBean object that it’s expecting and the attribute that it’s
gonna be connecting to is HeapMemoryUsage. So that one is based
on the JVM itself, right, the core of the Java machine in JBoss. The core or the encapsulated Java machine. But then if I go to free memory, well, this is now going to be something
specific to JBoss itself, right. So if I look at this one, it’s
the same management port, it is putting it on the same URL and
again you can do RMI or IIOP, but the object name is different, right. So, the class here is jboss.system,
so it’s a JBoss-specific class. That’s its type, and so
it’s part of that package. The type is ServerInfo. And
the attribute name is FreeMemory. And so for just about any application they
will expose what those object names and types are, so you can map those attributes in if it’s one that’s not included
in the template. And then they look like any other monitor after
that, so you’ll have statistic thresholds. If it’s one that supports warning or
some of the other attributes, it’ll be listed there, and then you can do dynamic
thresholds and everything else just normally like anything else, and that lets
you do things like put them side-by-side. So for example, if you have a Windows
server that happens to be running JBoss, you can actually build the monitors for
the server instance itself, any other processes that make up that
application, along with the application containers that are installed on JBoss and
JBoss itself, as a simple template and then apply that to as many machines as you
can, or you migrate that over to Linux. You just transform the template to monitor
the Linux instance that it’s running on and any other containers and
pull all the data in. So JMX is incredibly flexible. It’s been a part of SAM forever. Just spend a little bit of time on Google
getting those type classes, and you can monitor anything in JMX. And I’d also say that one of the things
I discovered from putting that template together was that I used the SAM wizard. There’s a JMX wizard. So if you’re out there worrying about
how am I gonna know all these things, the JMX wizard will actually, once you
make the connection to the system, give you a list of all of
those hierarchical elements that you can then go through and pick-
Right. Which ones you want. That’s right. So you don’t have to know
everything off the top of your head. It’s there. We can get you to it. It’s named and it engages and invites you to explore what is
actually exposed on that system. Right.
So, Java is clearly not a difficult…
a hard to reach place. It is not a hard to reach place. It’s a different place. Yeah.
Yes. It has its own theories, but
it is not a hard to reach place. Not at all. So I think we’ve
demonstrated that not only is it possible to monitor an awful lot of
things that people say we don’t monitor. We monitor all the things. We monitor all the things, but
it’s actually pretty straightforward. I mean, do you really think
that after this session that they’re still going to say that those
hard to reach places are something that are just impossible like Linux and
SQL and Oracle and everything else? Probably. What? Well for good reasons. People’s environments and
requirements are always changing. They’re changing constantly. So the folks that are needing this
stuff tomorrow may not know it today. So tomorrow when it comes up
they’ll be trying to catch up. Okay well, fair. And it’s good that we’re
recording this for posterity then. Absolutely. And the other thing is we’ve empowered
you guys to actually go out and spread the gospel of monitoring glory. Hallelujah! So that may be, just maybe,
some of these newcomers, people who are just coming to THWACK or
just coming to the product, so be able to start off with deep monitoring of a wide
variety of applications with no heartburn. Well, we can only hope. So Patrick, Janice, thank you so
much for coming on today. This has been fantastic. Thanks to everyone out there for
joining us. And again hopefully you’ll
have an easier time reaching all of those hard to reach places
with all of your monitoring tools. Yeah, thanks again. Thank you.
Thanks.
