When I hear "Key Performance Indicator" I get a little worried, because usually the next thing done is to link performance to reward, and then you can come unstuck very quickly. I am always reminded of the software firm that decided on a reward system around bug fixing - the testers would be rewarded for finding bugs and the developers rewarded for fixing them. Development ground to a halt as an instant black market formed around the insertion, detection and correction of bugs.
Your organisational KPIs should be customer-focussed. Depending on the type of software product you are making, you can measure this in the following ways (a small sketch of these measurements follows the list):
Sales - Is your product meeting customer requirements? You may be able to measure this as the ratio of software presentations to sales, or of visits to your web site's purchase page to actual purchases.
Quality - Is your software understandable and reliable? How many support calls per customer do you get per day? Are the questions about how to do something, or are they error reports?
Customer satisfaction - How satisfied are your customers with your product? Survey your customers to find out what you could be doing to increase their satisfaction, then survey them again later to find out whether you've improved. (Don't annoy your customers by asking a lot of questions or surveying too frequently.)
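Here is a minimal Python sketch of how those three measurements might be tracked. All of the figures and function names are hypothetical; in practice the inputs would come from your sales, support and survey systems.

    def conversion_rate(purchases, purchase_page_visits):
        """Sales: fraction of purchase-page visits that became purchases."""
        return purchases / purchase_page_visits if purchase_page_visits else 0.0

    def support_calls_per_customer(calls_per_day, customers):
        """Quality: how much hand-holding the product needs each day."""
        return calls_per_day / customers if customers else 0.0

    def satisfaction_trend(previous_avg, current_avg):
        """Customer satisfaction: movement between two survey rounds."""
        return current_avg - previous_avg

    print(conversion_rate(42, 1000))            # 0.042
    print(support_calls_per_customer(15, 600))  # 0.025
    print(satisfaction_trend(3.8, 4.1))         # roughly +0.3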
Yes, these indicators seem to have nothing to do with base-level software metrics like bugs found and lines of code produced. However, the problem with counting bugs found is that you then have to grade the severity of each bug, and refactoring will often reduce your lines of code. Timeliness only matters if you are meeting your customer's expectations of timely delivery.
Concentrate on the business goals. If customers are buying your software, don't need a lot of support to use it, and are happy, then your software organisation is successful. No measure of bugs detected, schedule slips or anything else will matter if you don't have those three things in place.
If your software project is like the majority out there, it will be late, over budget, ship with fewer features than anticipated and have bugs. Don't beat yourself up over these things; deal with them and move on. Yes, you need bug databases, source control, testing and a way of measuring project velocity, but in the end, if you don't meet the business outcomes then you can't be successful, regardless of how polished and shiny your code is and how few bugs it has.
Update to try to address the revised question
KPIs, as you intend to use them, are difficult when you are delivering an intangible product that is also often a moving target. Will the KPIs you use this year on an accounting system have the same meaning next year, when you are implementing a document management system?
Let's take as an example a profession where KPIs are used widely - lawyers. Lawyers are measured with KPIs such as: average billed hours worked per day; hours billed per month; age of debtors ledger; average age of unbilled work; percent of billed fees written off; and so on. You should notice a trend here - all these KPIs relate to the willingness (or not) of clients to pay for the services rendered. This is the final arbiter of success, and it is why I suggested (above) some ways you could use this type of measurement as KPIs for your software business.
When you try to define KPIs that don't relate to your client's willingness to pay for the value you are providing, you run into problems: what exactly are you measuring, how are you measuring it, and is what you measure this year comparable to what you measured last year?
"Dollars paid by clients" has a fixed meaning from year to year. Arbitrary metrics like "bugs in software", "timeliness of release" and "flexibility" don't, and an increase in the KPI may have no direct relationship to the underlying value it is meant to measure - the assumption that "more bugs means lower quality", for example, doesn't always hold.
For example, after the Columbia disaster, I recall the investigation team came up with several hundred recommendations and items to be investigated. Did these newly discovered "bugs" mean the space shuttle suddenly had a lot less quality? Actually, after the investigation the space shuttle had more quality. So a KPI around bugs can easily be distorted by an extensive QA session and more bugs reported may actually mean your software has higher quality.
Productivity in terms of timeliness of releases is easily distorted by commercial factors, such as a client throwing money at you to do some custom development for them. Your release schedule will slip but your business will improve.
As for flexibility, I can't even hazard a guess at how you would measure something so intangible.
About the only measurement I can think of that has value this side of the client's wallet is project velocity: how much did we estimate we would do last iteration/cycle/release, and how much did we actually get done? Then plug this figure into the time available for the next iteration/cycle/release to estimate how much you will probably get done this time. You can display the time remaining in a burn-down chart or similar during the iteration.
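A rough sketch of that calculation, in Python. The history figures and the three-iteration averaging window are hypothetical; any consistent unit of completed work will do.

    # Hypothetical work completed per iteration, in story points.
    completed = [21, 18, 24, 20]

    def forecast_next(history, window=3):
        """Estimate next iteration's capacity from recent velocity."""
        recent = history[-window:]
        return sum(recent) / len(recent)

    print(forecast_next(completed))  # (18 + 24 + 20) / 3, about 20.7 points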
The rest comes down to process, which I don't think you can pin down with KPIs. All you can do is make sure your developers know what everyone is doing (daily developer meetings), your extended team gets input (weekly or fortnightly team meetings), you understand what worked last time and what didn't (retrospectives), and above all that you have transparent, effective communication.
Unfortunately, I don't think there are any magic KPIs of the kind you are after (but don't overlook the relevance of getting money from clients as a KPI).
By far the best single indicator is "tested functionality delivered and accepted". In the Agile world, we usually measure "functionality" in terms of "user stories", but it can be in any convenient form as long as it measures actual functionality that has been delivered, tested and accepted by the customer.
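As a minimal illustration of counting that indicator (the story records and field names here are hypothetical):

    # Only work that is both tested and accepted by the customer counts.
    stories = [
        {"id": 1, "tested": True,  "accepted": True},
        {"id": 2, "tested": True,  "accepted": False},  # rejected
        {"id": 3, "tested": False, "accepted": False},  # still in progress
    ]

    delivered = sum(1 for s in stories if s["tested"] and s["accepted"])
    print(f"{delivered} of {len(stories)} stories delivered and accepted")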
The usual other measures, like SLOC, SLOC/staff-hour, etc, fail because of Charlie's First Law of Management, which is:
People will deliver whatever they're being rewarded to deliver.
Set up your measures as SLOC, and you'll get lots of SLOC. Use SLOC/hr, and you'll get lots of SLOC/hr. Give them bonuses for working overtime, and you'll get lots of overtime.
Oh, and remember the corollary, too:
What people are delivering is what they think will be rewarding to deliver.
If you're not getting what you want, ask why it's rewarding to do the stuff that's getting done.
Benno, I'm answering your comment but didn't have enough characters there for the answer.
This depends on the problem you are solving. For instance, suppose the issue is that the time from when a developer checks in code until it is actually placed in production seems too long. First you would get a baseline measurement of how long it is taking. Then you would put in your change and measure for a period of time to see whether it now takes less time. You might also compare the number of times a solution was determined not to work and was sent back for rework, before and after, to make sure the process hasn't become faster at the expense of quality.
Now, the problem with these measurements in IT is that it may take quite some time to accumulate enough data, as some problems do not recur frequently. In that case you may have to start by relying on subjective data until you can accumulate enough hard data to know whether the change was good. But do not ask whether something is an improvement until the users have become used to it. In the first week or two of a new process you will meet resistance to change, and you will therefore get poor subjective results if you ask too early.
Another thing to be wary of is that if people know you are measuring something, they will be afraid that their personal performance is being measured and will game the system to get good results. It is often best if you can take measurements from some system already in place. (We have a system that manages requests for software changes; we could query its database to find out historically how many requests missed the deadline, how many were reopened after being closed or are related to past requests, what the time differential is between the developer finishing and the code being moved to production, and so on.) You may also have to consider eliminating severe outliers, especially if the data spans the time periods of both the old and the new system. For instance, we have one request that has been in QA for over 100 days, not because it is bad but because QA has an availability problem and this request is the lowest priority, so it keeps getting bumped for higher-priority work. That time would not be valuable in measuring the improvement, because the factor making it so long isn't the process you are trying to fix. If you graph the data, you will easily see the outliers that might need excluding.
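A small sketch of that kind of lead-time measurement, assuming a hypothetical change-request table with check-in and move-to-production dates (the 60-day outlier cap is likewise an arbitrary, illustrative threshold):

    from datetime import date
    from statistics import median

    # Hypothetical change requests exported from the tracking system.
    requests = [
        {"checked_in": date(2024, 1, 2), "in_production": date(2024, 1, 9)},
        {"checked_in": date(2024, 1, 5), "in_production": date(2024, 1, 11)},
        {"checked_in": date(2024, 1, 8), "in_production": date(2024, 4, 30)},  # stuck in QA
    ]

    lead_times = [(r["in_production"] - r["checked_in"]).days for r in requests]

    # Exclude durations whose cause isn't the process under study.
    usable = [d for d in lead_times if d <= 60]

    print("median lead time (days):", median(usable))  # 6.5

Compare this median before and after the change to see whether the process actually improved.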
Basing your KPIs around Cost, Quality & Schedule would be a good start. Consider what the attributes you want to measure are for each of those.
Being able to split each of these measures to show the cost of bugs would be useful - lots of bug-fix effort late in the project means a cost/schedule blowout. Being able to profile which parts of the codebase are buggy could help in targeting additional testing and possible code rewrites - typically 80% of the bugs will come from 20% of the code, and knowing where that 20% is will allow your team to focus better (see the sketch below).
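A quick way to get that profile, sketched here against a hypothetical bug-tracker export that records the module each bug was fixed in:

    from collections import Counter

    # Hypothetical export: the module each closed bug was fixed in.
    bug_modules = ["billing", "billing", "auth", "billing", "reports", "auth"]

    counts = Counter(bug_modules)
    total = sum(counts.values())
    for module, n in counts.most_common():
        print(f"{module}: {n} bugs ({n / total:.0%})")
    # prints: billing: 3 bugs (50%), auth: 2 bugs (33%), reports: 1 bugs (17%)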
EDIT: Look at measures like Cost of Quality (CoQ) and Cost of Poor Quality (CoPQ).
Measures like productivity are always hard to quantify - for example, using LOC/day leads to a debate about what exactly counts as a line of code. It can also lead to silly code formatting to "boost" productivity if the developers don't understand why these things are being tracked, or if they perceive them as personal measurements. Even if LOC/day is not measured at the developer level, you can still get team rivalry leading to the same result.
EDIT: There are some good discussions to be found on Steve McConnell's Construx website. [Yes, that's the Steve McConnell of Code Complete fame]
No process is going to help you improve what you do except actually getting everyone together and figuring out what is working and what isn't. For the team I'm currently leading, we do this through a series of retrospectives (on which I'd highly recommend this book). The teams generally know what parts they want to improve - the trick is giving them the empowerment to actually measure and improve those things.
Yes, you certainly still need someone looking at the macro level. If you look at an organization like Toyota, they have a Chief Engineer who straddles the line between business and production (for a good explanation, see Scott Bellware's blog post). In our organization we have someone similar - my boss was one of the initial developers of our product nearly 20 years ago and stays highly active on the tech side, but is heavily invested in the customer side as well. My job is also to look at the teams as a whole and suggest improvements.
To measure, we first make sure that any improvements we strive for are things our teams can actually change, and then we use something akin to SMART goals so that any improvement is measurable. We have a Big, Visible Wall on which we post the notes from the retrospective. This also happens to be where we hold our daily stand-ups, so it keeps us focused on what is going on.
For rolling statistics up to our executive meetings, we focus on code delivery - not lines of code delivered. I purposely started the team off measuring in nebulous units, meaning that we don't report up that we worked x hours, or days, or whatever. What the executives do see is a trend chart of how well we are delivering our features and how we are improving. We'll also include interesting tidbits when the team feels like sharing them.
The best part about all of this is that we can try things for a month and then re-evaluate just four weeks later. This creates a much lower barrier to entry for trying new things, since the team knows that if something is impacting them, we'll either cancel it immediately or re-evaluate and find better ways at the next retrospective.
The bad part is that this isn't exactly what you are looking for. There isn't one metric, or set of metrics, that we continually follow. We watch trends at all times and measure the ones we think are interesting - but only for a little while, and only when the team is setting out to achieve a specific goal because of them. But in general I'm quite happy with how it works, and I've seen a marked improvement in the teams' involvement in improving the process. We aren't quite kaizen, but we're getting better every day.
I did process improvement professionally for 14 years. Here is my advice: stop trying to quantify and start talking to people. Measurement works fine for a specific problem (once you know the problem, you have a better idea of what to measure) and for repeatable processes like manufacturing. Your people know exactly where the problem areas are, and so do your customers and users (from a very different perspective).

Flow-chart the actual process for the areas where there are concerns, using industrial engineering symbols, not computer programming symbols - the actual process, not what we pretend the process is, so you will need to observe as well as ask questions. Once you see the whole flow of the process, look for delays, areas where work is duplicated, and areas of unnecessary process (usually steps added to account for human error, thereby creating many more potential sources of human error). Question the need for each step and whether there is a better way to do it. Test potential changes and see whether they are in fact an improvement (far too often they make the situation worse, not better).

Do not, under any circumstances, talk only to managers when getting a feel for the problems or when flow-charting. You will not get a true picture and will thus solve the wrong problem.
Understanding waste and value-stream mapping will show you where you need to make improvements, and from that knowledge you will learn what you need to measure. The principles of Lean and Kanban apply here. Understanding waste and its effects on producing software will start you down a path to betterment that is inevitably specific to your organization; you can't take a cookie-cutter approach. Read (or listen to) "The Goal" and "Lean Thinking" for more on this really amazing and eye-opening perspective on what's wrong and how to fix it.
The best use for Key Performance Indicators is driving (or steering, if you prefer) - making course corrections in real time.
(See Dashboards are for Driving for more blather about this sub-topic. Caveat: I am the author of the blathering article.)
So, the question back to you is: are you trying to evaluate performance after the fact, when it is too late to do anything about it, or are you trying to find KPIs that can help you stay on course?
If the former, any metric your organization cares about (bug count, ship-date slippage, lines of code with comments, customer return percentages, etc.) will be fine. Measure away and good luck getting any better in between shipping products and upgrades ;-)
If the latter, choose velocity - assuming, of course, that you are using test-driven development (TDD).
EDIT: so it's the former. Well, here's why you are probably out of luck:
Suppose you decide that "quality" is best quantified by measuring the number of bugs reported by customers as your post-process KPI. Let's assume that you are using TDD, and say that your team delivers Product #1 in 6 months, and after 6 months in the field you find that you have 10 customer-reported bugs. So now what, exactly, are you going to do to improve your process? Test more? Test specifically for things like the causes of the bugs that were reported? It seems to me that you would already be testing, and when bugs are discovered - whether by the customer or not - you add a regression test for the specific bug and additional unit tests to make sure there are no more similar bugs. In other words, your post-process improvement response will be no different from your in-process improvement response, so this KPI is of no significant help in improving your process. The point is that the way you improve your process remains the same regardless of whether the bugs are discovered 6 months after release or two days into coding. So while this might be a shiny KPI to put on a manager's wall or in a departmental newsletter, it really will not change your process-improvement mechanisms. (And beware of putting too much stock in this KPI, because it can be wildly influenced by factors beyond your control!) In short, knowing the number of bugs does not help you improve.
(There is another danger here, one commonly found not just in business but also in the military, and that is the illusion that the post-mortem analysis revealed valuable information, so the lessons learned post-mortem are vigorously applied to the next project, which is probably not the same as the last project. This is known as "fighting the last war".)
Suppose the number of customer returns/refunds is your KPI of choice for "quality" - if this number is 5, what does that tell you? The specific reasons why customers requested a refund may be some indication of quality problems ("too slow", "doesn't interface with XYZ system", etc.), but the mere number of such incidents tells you nothing. A variance against an expected return percentage might tell you whether quality was improving, but again, the number itself does not help you improve. You need more information than the number can give you.
So for "timeliness of releases" what measurement would be appropriate? Number of days of ship-date slippage? Percent overrun based on original estimates? It doesn't matter, because again the numbers do not help you improve.
If you can measure "productivity" after the product is done, then you can probably measure it while the product is being developed (e.g. velocity). The difference is that lower-than-expected productivity during development can be improved immediately, while an overall productivity number measured after development is complete is too gross, too averaged, to be of any use. One could only guess at why it was lower than expected six months later...
I have no idea how one might measure "flexibility", that sounds like marketing jargon ;-)
I hope I haven't pounded this nail too hard or too far, but I don't think that there is anything useful that you can measure after-the-fact that you cannot measure while in progress. And there are a lot of after-the-fact measurements that are useless without knowing the causes.
You can get a lot of ideas about KPIs, along with example dashboards, at http://www.dashboardzone.com
It has KPIs by industry and functional area.